Polymorphisms of the core, NS3, and NS5A proteins of hepatitis C virus genotype 1b associate with development of hepatocellular carcinoma

Hepatocellular carcinoma (HCC) is one of the common sequelae of hepatitis C virus (HCV) infection. It remains controversial, however, whether HCV itself plays a direct role in the development of HCC. Although HCV core, NS3, and NS5A proteins were reported to display tumorigenic activities in cell culture and experimental animal systems, their clinical impact on HCC development in humans is still unclear. In this study we investigated sequence polymorphisms in the core protein, NS3, and NS5A of HCV genotype 1b (HCV‐1b) in 49 patients who later developed HCC during a follow‐up of an average of 6.5 years and in 100 patients who did not develop HCC after a 15‐year follow‐up. Sequence analysis revealed that Gln at position 70 of the core protein (core‐Gln70), Tyr at position 1082 plus Gln at 1112 of NS3 (NS3‐Tyr1082/Gln1112), and six or more mutations in the interferon/ribavirin resistance‐determining region of NS5A (NS5A‐IRRDR≥6) were significantly associated with development of HCC. Multivariate analysis identified core‐Gln70, NS3‐Tyr1082/Gln1112, and α‐fetoprotein (AFP) levels (>20 ng/L) as independent factors associated with HCC. Kaplan‐Meier analysis revealed a higher cumulative incidence of HCC for patients infected with HCV isolates with core‐Gln70, NS3‐Tyr1082/Gln1112 or both than for those with non‐(Gln70 plus NS3‐Tyr1082/Gln1112). In most cases, neither the residues at position 70 of the core protein nor positions 1082 and 1112 of the NS3 protein changed during the observation period. Conclusion: HCV isolates with core‐Gln70 and/or NS3‐Tyr1082/Gln1112 are more closely associated with HCC development compared to those with non‐(Gln70 plus NS3‐Tyr1082/Gln1112). (HEPATOLOGY 2013;58:555‐563)

Hepatitis C virus (HCV) is a major etiologic agent of chronic hepatitis worldwide, with the estimated number of infected individuals exceeding 180 million. Approximately 15% to 20% of chronically infected individuals develop liver cirrhosis within a decade or so after infection, with hepatocellular carcinoma (HCC) arising from cirrhosis at an estimated rate of 1% to 4% per year. [1][2][3] Several host factors such as male gender, older age, elevated α-fetoprotein (AFP) level, advanced liver fibrosis, and nonresponsiveness to interferon (IFN) therapy have been reported as important predictors of HCC development. 4,5 Recently, a host genetic factor, i.e., the DEPDC5 locus polymorphism, was reported to be associated with progression to HCC in HCV-infected individuals. 6 On the other hand, it remains controversial whether HCV itself plays a direct role in the development of HCC. Experimental data suggest that HCV contributes to HCC by modulating pathways that promote malignant transformation of hepatocytes. HCV core, NS3, and NS5A proteins were shown to be involved in a number of potentially oncogenic pathways in cell culture and experimental animal systems. 7 HCV core protein rendered cultured cells more resistant to apoptosis 8,9 and promoted ras oncogene-mediated transformation. 10,11 Moreover, transgenic mice expressing the HCV core protein in the liver developed HCC. 12 However, the clinical impact of HCV proteins on HCC development in humans, and whether all HCV isolates are equally associated with HCC, are yet to be determined. In a clinical setting, HCV core protein mutations at positions 70 (Gln70) and/or 91 (Met91) were closely associated with HCC development. [13][14][15][16] Gln70 and/or Met91 were also linked to resistance to pegylated IFN (PEG-IFN)/ribavirin (RBV) treatment. [17][18][19][20]
In addition, we and other investigators reported that an N-terminal part of the NS3 protein has the capacity to transform NIH3T3 and rat fibroblast cells 21,22 and to render NIH3T3 cells more resistant to DNA damage-induced apoptosis, which is thought to be a prerequisite for malignant transformation of the cell. 23 Also, the NS5A protein is a pleiotropic protein with key roles in both viral RNA replication and modulation of host cell functions. 24 In particular, the links between NS5A and the IFN responses have been widely discussed. It was proposed initially that sequence variations within a region in NS5A spanning amino acids (aa) 2209 to 2248, called the IFN sensitivity-determining region (ISDR), were correlated with IFN responsiveness. 25 Subsequently, in the era of PEG-IFN/RBV combination therapy, we identified a new region near the C-terminus of NS5A spanning aa 2334 to 2379, which we referred to as the IFN/RBV resistance-determining region (IRRDR). 26,27 The degree of sequence variation within the IRRDR was significantly associated with the clinical outcome of PEG-IFN/RBV therapy. In the context of HCC, several retrospective studies suggested that IFN-based therapy might reduce the risk of HCC development. 4,[28][29][30] In an attempt to clarify whether viral factors, in particular those within the core, NS3, and NS5A proteins, are involved in HCC development, we carried out a comparative analysis of the aa sequences obtained from HCV patients who developed HCC and those who did not. In addition, we studied the sequence evolution of these genes in the interval between chronic hepatitis C and HCC development over a period of 15 years.

Patients and Methods
Ethics Statement. The study protocol, which conforms to the provisions of the 1975 Declaration of Helsinki, was approved beforehand by the Ethics Committees of Akashi City Hospital and Kobe University Graduate School of Medicine, and written informed consent was obtained from each patient enrolled in this study.
Patients. A total of 49 HCV-infected patients who developed HCC (HCC group) were retrospectively examined. They were followed up from 1988 to 2003, with a mean time to HCC development of 6.5 ± 2.9 years. Paired serum samples were collected at the time of chronic hepatitis C (pre-HCC sample) and at HCC development (post-HCC sample). As a control group, 100 HCV-infected patients who were followed up over a period of 15 years (from 1988 to 2003) without HCC development were retrospectively examined. Serum samples of the control group were available from the time of the first visit to the clinic. All patients enrolled in this study were chronically infected with HCV genotype 1b (HCV-1b). HCV subtype was determined as reported previously. 31 Serum HCV RNA titers were quantitated by reverse-transcription polymerase chain reaction (RT-PCR) with an internal RNA standard derived from the 5′ noncoding region of HCV (Amplicor HCV Monitor test, v. 2.0, Roche Diagnostics, Tokyo, Japan). All patients underwent liver biopsy and were diagnosed with chronic hepatitis. All HCC patients and 68% (68/100) of non-HCC patients received IFN monotherapy, either natural IFN alpha (Sumiferon, Dainipponsumitomo Pharmaceutical, Osaka, Japan) at a dose of 6 million units (MU) or recombinant IFN alpha 2b (Intron A; Schering-Plough, Osaka, Japan) at a dose of 10 MU, 3 times a week for 6 months. All HCC patients were nonresponders (NR), who had detectable viremia during the entire course of IFN treatment. On the other hand, 18 (26%) of the 68 non-HCC patients treated with IFN achieved HCV RNA negativity at the end of treatment, followed by rebound viremia within 6 months after the treatment; they were therefore referred to as relapsers. The other 50 IFN-treated, non-HCC patients were NR. The remaining 32 non-HCC patients did not receive IFN. All patients were seen every 2 months and tested for liver function markers during the follow-up period.
Sequence Analysis of HCV Core, NS3, and NS5A Proteins. HCV RNA was extracted from 140 μL of serum using a commercially available kit (QIAamp viral RNA kit; Qiagen, Tokyo, Japan). The core, NS3, and NS5A regions of the HCV genome were amplified as described elsewhere. 26,[32][33][34] The sequences of the amplified fragments were determined by direct sequencing. The aa sequences were deduced and aligned using GENETYX Win software version 7.0 (GENETYX, Tokyo, Japan). Amino acids were numbered according to the polyprotein of the HCV-1b prototype, HCV-J. 35
Statistical Analysis. Statistical differences in the baseline parameters of HCC and control groups were determined by Student's t test for numerical variables and Fisher's exact probability or chi-square tests for categorical variables. Likewise, statistical differences in viral mutations between HCC and control groups were determined by Fisher's exact probability test. Kaplan-Meier analysis was performed to estimate the cumulative incidence of HCC, and the resulting curves were compared by the log-rank test. Univariate and multivariate logistic analyses were performed to identify variables that were independently associated with HCC development. Variables with P < 0.1 in univariate analysis were included in a backward stepwise multivariate logistic regression analysis. The odds ratios and 95% confidence intervals (95% CI) were calculated. All statistical analyses were performed using SPSS v. 16 software (Chicago, IL). Unless otherwise stated, P < 0.05 was considered statistically significant.
Nucleotide Sequence Accession Numbers. The sequence data reported in this article have been deposited in the DDBJ/EMBL/GenBank nucleotide sequence databases with the accession numbers AB719460 through AB719842.

Results
Demographic Characteristics of HCC and Control Groups. The clinical characteristics of HCC and control groups are shown in Table 1. The HCC group had significantly higher levels of ALT, AST, and AFP, and a higher fibrosis staging score, than the control group. There was no significant difference in viremia titers between the two groups.
Correlation Between Core Protein Sequence Polymorphism and HCC Development. HCV core protein sequences were obtained from all (49/49) pre-HCC sera and from 94% (94/100) of control patients' sera. Comparative sequence analysis revealed that 22 (45%) of 49 HCV isolates in the pre-HCC sera (pre-HCC isolates) and 59 (63%) of 94 HCV isolates from the control group (control isolates) had wild-core (Arg70/Leu91) (Table 2). The difference between HCC and control groups was of borderline statistical significance (P = 0.05). When the sequence pattern at position 70 alone was examined, a stronger association with HCC was observed. We found that 21 (43%) of 49 pre-HCC isolates had Gln70 while only 13 (14%) of 94 control isolates did (P = 0.0002). On the other hand, there was no significant correlation between the sequence pattern at position 91 and HCC. Thus, a single mutation at position 70 (Gln70) was the only polymorphic factor within the core protein that was significantly associated with HCC development. It should be noted that there was no significant correlation between Gln70 and the degree of fibrosis progression (data not shown).
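The position 70 comparison can be checked from the reported counts alone. The following is a minimal sketch of a two-sided Fisher's exact test in Python. The function name `fisher_exact_two_sided` is illustrative, and the two-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is an assumption, because the text does not state which variant the statistics package applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumption: the two-sided P value is the sum of the probabilities of all
    tables (with the same margins) that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Counts from the text: Gln70 in 21 of 49 pre-HCC isolates
# and in 13 of 94 control isolates.
p_core = fisher_exact_two_sided(21, 49 - 21, 13, 94 - 13)
print(f"core-Gln70, pre-HCC vs. control: P = {p_core:.4f}")
```

The same function can be applied to the other 2x2 comparisons in this section by substituting the corresponding counts.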
Correlation Between NS3 Protein Sequence Polymorphism and HCC Development. Sequences of the NS3 serine protease domain (aa 1027 to 1146) were obtained from 94% (46/49) and 93% (93/100) of pre-HCC and control isolates, respectively. We found that 29 (63%) of 46 pre-HCC isolates had Tyr and Gln at positions 1082 and 1112, respectively (Tyr1082/Gln1112), while 39 (42%) of 93 control isolates did (Table 2). The difference in the proportion between pre-HCC and control isolates was statistically significant (P = 0.029). On the other hand, there was no significant correlation between Tyr1082/Gln1112 and the degree of fibrosis progression (data not shown).
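As a rough illustration of the size of this association, an unadjusted odds ratio with a Wald 95% confidence interval can be derived from the same counts. This is a sketch under stated assumptions: the helper `odds_ratio_wald` is hypothetical, the interval uses the normal approximation on the log scale, and the result is a crude estimate that does not correspond to the adjusted values from the multivariate model in Table 3.

```python
from math import exp, log, sqrt


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for [[a, b], [c, d]].

    a, b: carriers and non-carriers in the HCC group
    c, d: carriers and non-carriers in the control group
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the normal approximation
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the text: Tyr1082/Gln1112 in 29 of 46 pre-HCC isolates
# and in 39 of 93 control isolates.
or_ns3, ci_low, ci_high = odds_ratio_wald(29, 46 - 29, 39, 93 - 39)
print(f"NS3-Tyr1082/Gln1112: OR = {or_ns3:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```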
Correlation Between NS5A Protein Sequence Polymorphism and HCC Development. NS5A protein sequences were obtained from 92% (45/49) and 74% (74/100) of pre-HCC and control isolates, respectively. Twenty-four (53%) of 45 pre-HCC isolates had IRRDR with six or more mutations (IRRDR≥6) while only 15 (20%) of 74 control isolates did (Table 2; P = 0.0003). We also found that pre-HCC isolates tended to have a higher degree of sequence heterogeneity in ISDR than control isolates, although the difference was not statistically significant, probably because of the small number of cases examined; 11 (24%) of 45 pre-HCC isolates and 8 (11%) of 74 control isolates had ISDR with three or more mutations (P = 0.07). Moreover, Asn at position 2218 (Asn2218) within the ISDR was found in 24% (11/45) of pre-HCC isolates and in only 4% (3/74) of the control isolates (P = 0.002), suggesting that Asn2218 is significantly associated with development of HCC.
Cumulative HCC Incidence on the Basis of Core-Gln70, NS3-Tyr1082/Gln1112, NS5A-IRRDR≥6, and NS5A-Asn2218. Follow-up study revealed that the cumulative HCC incidence in patients infected with HCV-1b isolates with core-Gln70 and in those with non-Gln70 was 29% and 5%, respectively, at the end of 5 years, 56% and 23% at the end of 10 years, and 63% and 26% at the end of 15 years (Fig. 1A), with the difference between the two groups being statistically significant (P < 0.0001; log-rank test). Likewise, the cumulative HCC incidence in patients infected with HCV-1b isolates with NS3-Tyr1082/Gln1112 and in those with non-(Tyr1082/Gln1112) was 15% and 7%, respectively, at the end of 5 years, 37% and 24% at the end of 10 years, and 45% and 24% at the end of 15 years (P = 0.02) (Fig. 1B). Also, the cumulative HCC incidence in patients infected with HCV-1b isolates with IRRDR≥6 and in those with IRRDR≤5 was 18% and 10%, respectively, at the end of 5 years, 59% and 22% at the end of 10 years, and 63% and 27% at the end of 15 years (P = 0.0002) (Fig. 1C). Similarly, the cumulative HCC incidence in patients infected with HCV-1b isolates with Asn2218 and in those with non-Asn2218 was 31% and 9%, respectively, at the end of 5 years, 77% and 28% at the end of 10 years, and 77% and 33% at the end of 15 years (P = 0.0003) (Fig. 1D).
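Cumulative incidence figures of this kind are read off a Kaplan-Meier curve as one minus the estimated HCC-free fraction. The sketch below shows the product-limit calculation in Python. Patient-level follow-up data are not given in the text, so the `times` and `events` lists are hypothetical toy values used only to demonstrate the method, and the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns [(time, HCC-free fraction)] at event times.

    times:  follow-up in years until HCC or censoring
    events: 1 if HCC developed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = [e for tt, e in data if tt == t]  # everyone whose follow-up ends at t
        cases = sum(leaving)
        if cases:
            # Patients censored at t still count as at risk at t
            survival *= 1 - cases / at_risk
            curve.append((t, survival))
        at_risk -= len(leaving)
        i += len(leaving)
    return curve


def cumulative_incidence(curve, year):
    """Cumulative incidence (1 - HCC-free fraction) at the end of the given year."""
    survival = 1.0
    for t, s in curve:
        if t <= year:
            survival = s
    return 1 - survival


# Hypothetical toy cohort of 10 patients (NOT data from the study)
times = [2, 4, 4, 7, 9, 12, 15, 15, 15, 15]
events = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
curve = kaplan_meier(times, events)
for year in (5, 10, 15):
    print(f"{year} years: {cumulative_incidence(curve, year):.0%}")
```

Comparing two such curves, as in Fig. 1, additionally requires a log-rank test, which is not sketched here.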
Identification of Independent Factors Correlated With HCC Development by Univariate and Multivariate Logistic Regression Analyses. In order to identify significant independent factors associated with HCC development, all available data on baseline patient parameters and core, NS3, and NS5A polymorphic factors were first analyzed by univariate logistic analysis. This analysis yielded eight factors that were significantly associated with HCC development: core-Gln70, NS3-(Tyr1082/Gln1112), NS5A-IRRDR≥6, NS5A-Asn2218, increased levels of ALT (>165 IU/L), AST (>65 IU/L), and AFP (>20 ng/L), and fibrosis staging score (≥3). Subsequently, those eight factors were entered in multivariate logistic regression analysis. This analysis identified two viral factors, core-Gln70 and NS3-(Tyr1082/Gln1112), and a host factor, AFP levels (>20 ng/L), as independent factors associated with HCC development (Table 3).
Evolution of the Sequences of the Core, NS3, and NS5A Proteins During the Follow-up Period From Chronic Hepatitis to HCC Development. Finally, we investigated sequence evolution of the core protein, NS3, and NS5A (IRRDR and ISDR) during the follow-up period from chronic hepatitis to HCC development by comparing the sequences between pre-HCC and post-HCC isolates. In most cases, neither the residue at position 70 of the core protein nor the residues at positions 1082 and 1112 of NS3 changed during the observation period. In contrast, IRRDR and ISDR showed a high degree of sequence evolution. IRRDR sequences were different between pre-HCC and post-HCC isolates in 66% (25/38) of cases analyzed (Fig. 3). IRRDR sequences tended to be more polymorphic at the time of HCC occurrence. The frequency of HCV isolates with IRRDR≥6 was significantly higher in post-HCC isolates than in pre-HCC isolates; IRRDR≥6 was found in 47% (18/38) of post-HCC isolates compared to 24% (9/38) of pre-HCC isolates (P = 0.03). On the other hand, ISDR≥3 was found in 21% (8/38) of post-HCC isolates compared to 11% (4/38) of pre-HCC isolates, a difference that was not statistically significant (P = 0.3).

Discussion
HCC is one of the common long-term complications of HCV infection. However, whether HCV itself plays a direct role in the development of HCC and whether all HCV isolates are equally associated with HCC development remain to be determined. HCV core, NS3, and NS5A proteins have been reported to affect a wide variety of potentially oncogenic pathways in cell culture and experimental animal systems. 7 In the present study, we demonstrated that HCV isolates with core-Gln70, NS3-Tyr1082/Gln1112, or NS5A-IRRDR≥6 were closely associated with HCC development. In addition, a follow-up study revealed that sequence patterns at position 70 of the core protein and positions 1082 and 1112 of NS3 did not change significantly during the progression from chronic hepatitis to HCC, while NS5A-IRRDR showed a significantly higher degree of sequence heterogeneity in post-HCC than in pre-HCC isolates.
The correlation between polymorphisms at positions 70 and 91 of the HCV-1b core protein and IFN-based treatment outcome has been extensively studied, especially in the Japanese population. [17][18][19][20] Interestingly, the same mutations were also associated with progression to HCC in the Japanese population with HCV-1b infection. 13 Results obtained in the present study confirmed and emphasized the significant association between HCC development and the mutation at position 70 (core-Gln70), but not the mutation at position 91 (Tables 2, 3; Fig. 1A). Despite the clinical evidence that strongly supports the correlation between core-Gln70 and HCC development, the molecular mechanism underlying this correlation is still obscure. Delhem et al. 36 found that tumor-derived HCV core proteins, but not nontumor-derived ones, interact with and activate double-stranded RNA-dependent protein kinase (protein kinase R or PKR), which might modulate viral persistence and carcinogenesis. Gln70 was found in two of the three tumor-derived sequences, whereas Arg70 was found in two of the three nontumor-derived ones.
As for the NS3 protein of HCV, a possible link between an N-terminal portion of NS3 encoding the viral serine protease (aa 1027 to 1146) and hepatocarcinogenesis has been reported. 21,22 However, information about the relationship between NS3 sequence diversity and HCC development is still limited. We previously reported a significant correlation between the predicted secondary structure of an N-terminal portion of NS3 and HCC development. 34 In the present study, we demonstrated that patients infected with HCV isolates with NS3-(Tyr1082/Gln1112) were at higher risk of developing HCC than those infected with HCV isolates with non-(Tyr1082/Gln1112) (Tables 2, 3; Fig. 2B). Computer-assisted secondary structure analysis of NS3 revealed that Tyr1082 was associated with the presence of a turn structure at around position 1083 while Phe1082 was associated with the absence of the turn structure. 34 Notably, the catalytic triad of the NS3 serine protease consists of His1083, Asp1107, and Ser1165. 37 Since positions 1082 and 1112 are in close vicinity of the catalytic triad, sequence diversity at these positions might influence the serine protease activity and also the pathogenicity of HCV. Large-scale, multicenter clinical studies as well as more detailed experimental studies at the molecular and cellular levels are needed to clarify the importance of sequence diversity at positions 1082 and 1112 of NS3 in HCV-mediated hepatocarcinogenesis.
Sequence heterogeneity in NS5A-ISDR and NS5A-IRRDR is correlated with IFN responsiveness. 17,18,25,26 As IFN-based therapy reduces the risk of HCC development, 4,28-30 we were interested in investigating whether there is a correlation between sequence heterogeneity in NS5A and development of HCC. Our present results revealed that a high degree of sequence heterogeneity in IRRDR (IRRDR≥6) was closely associated with HCC development (Table 2). We previously reported that IRRDR≥6 was significantly associated with good responses to PEG-IFN/RBV combination therapy. 26,27 These results collectively suggest that oncogenic properties and PEG-IFN/RBV responsiveness are independent viral characteristics and that PEG-IFN/RBV therapy helps eliminate oncogenic HCV isolates, thus reducing the risk of HCC development.
Position 2218 of NS5A, located within ISDR, appears to tolerate a wide range of aa substitutions, as observed in different HCV-1b isolates. 25,38,39 Interestingly, Asn at position 2218 (Asn2218) was detected significantly more frequently in pre-HCC isolates than in the control isolates. Further studies are needed to determine the possible importance of this residue in hepatocarcinogenesis.
Another focus of attention is how the sequences of the core protein, NS3, and NS5A-IRRDR evolve during the interval between chronic hepatitis and HCC development. One of the significant advantages of the present study was that we could conduct a longitudinal investigation by analyzing the target sequences of pre- and post-HCC isolates. We found that core-Gln70 and NS3-(Tyr1082/Gln1112) were well conserved in each paired sample. This indicates that core-Gln70 and NS3-(Tyr1082/Gln1112) were already present before the development of HCC. Non-Gln70 of the core protein and non-Tyr1082 and non-Gln1112 of NS3 were also well conserved in each paired sample. These results raise the possibility that these sequence patterns were not a result of HCC but rather a possible causative factor for the development of HCC. We hypothesize, therefore, that HCV isolates with core-Gln70 and/or NS3-(Tyr1082/Gln1112) are highly oncogenic, whereas those with non-(Gln70 plus NS3-Tyr1082/Gln1112) are less oncogenic. It is not yet clear whether these oncogenic mutations were present from the very beginning of HCV infection or whether they emerged at a certain time point (before the initiation of follow-up) during long-term persistence through adaptive viral evolution in the host. A more comprehensive follow-up study is needed to address this issue. In any case, core-Gln70 and NS3-(Tyr1082/Gln1112) could be considered an index for prediction of HCC development. On the other hand, IRRDR in NS5A is more tolerant of sequence evolution. IRRDR in post-HCC isolates showed a significantly higher degree of sequence heterogeneity than that in pre-HCC isolates. This observation suggests that IRRDR is under strong selective pressure during the course of HCV infection and that the high degree of IRRDR heterogeneity (IRRDR≥6) in HCV isolates from patients with HCC may not be a causative factor for development of HCC.
In conclusion, the present results suggest that patients infected with HCV isolates with core-Gln70 and/or NS3-(Tyr1082/Gln1112) are at higher risk of developing HCC than those infected with isolates with non-(Gln70 plus NS3-Tyr1082/Gln1112).