Current and Emerging Themes in the Structural Analysis of Viral RNA Genomes: Applications for the Development of Novel Therapeutic Drugs

RNA molecules play distinct roles in many biological processes. This functional diversity is intimately related to RNA folding. In recent years, advances in high-throughput structure analysis and the development of powerful new bioinformatic tools have enabled the first structural maps of the eukaryotic transcriptome and the establishment of novel structure-function relationships. This progress has been of special relevance in the field of molecular virology. Viral genomes are compact entities that require overlapping coding levels to bear all the genetic information required for viral propagation. This is achieved by the acquisition of functional RNA domains, structurally conserved units that perform essential roles in the completion of the viral cycle. Interfering with the activity of these structural elements offers a potential means of treating viral infections, such as that caused by the hepatitis C virus (HCV). This review summarizes major achievements in the development of emerging methodologies for the analysis of RNA folding and their application to the study of the HCV genome structure. It also examines progress toward the design of novel nucleic acid-based antiviral compounds able to interfere with the folding of functional RNA domains.


INTRODUCTION
Advances in next-generation sequencing techniques have provided a wide overview of the genetic organization of many organisms. These studies have revealed that protein-coding genes are just a small fraction of the total genetic information, which cannot explain the complexity and variety of biological functions. Therefore, additional functional elements must be encoded in the rest of the genomic nucleotide sequence. The knowledge that RNA molecules can act as regulatory and catalytic elements prompted numerous studies that finally established RNA as a key player in cell survival in all living organisms [1]. As with protein-mediated catalysis, RNA function also depends on its three-dimensional (3D) conformation, as observed in X-ray crystallography studies performed with catalytic RNA molecules [2][3][4][5][6]. For example, investigations showed that conformational rearrangements were required for efficient catalysis mediated by these ribozymes [7].
Knowledge of RNA 3D structure has grown explosively since the first crystallographic studies performed with the yeast phenylalanine transfer RNA (tRNA) [8][9][10]. Since then, RNA folding has come to be considered a dynamic and hierarchical system defined by primary, secondary, tertiary and quaternary structure, in contrast to proteins. Primary structure, or RNA sequence, is widely used for covariation analysis of different RNA molecules. This is the first step in the identification of conserved structural elements and allows molecular RNA families to be defined. RNA secondary structure is defined by base-pair interactions that form stems joined and closed by different types of loops: apical loops, internal loops, bulges or junction loops [11]. These loops exert an effect on the surrounding residues or can act as global controllers of RNA folding by distorting, bending or unwinding distant helices, thus contributing to the overall architecture of the RNA molecule. Tertiary structure motifs act as fixing partners that stabilize the 3D folding. These include coaxial helices, tetraloop-tetraloop interactions, kissing-loop contacts, pseudoknots, ribose zippers, dinucleotide platforms and duplex stacking, either side by side or orthogonally [12]. Finally, quaternary structure relies on intermolecular contacts, which are ultimately governed by secondary and tertiary interactions. In addition to this substantial complexity, RNA molecules are dynamic entities that can adopt a wide range of slightly different conformations to promote distinct biological roles. Therefore, the main question now lies in deciphering the biophysical relationship between structure and biological function. For that purpose, biochemical, biophysical and mutational analyses, together with novel bioinformatic strategies, must be combined. These investigations will also help to expand our knowledge of key biological processes mediated by RNA molecules, such as transcription elongation, splicing or translation [1].
RNA plasticity acquires special relevance in RNA viruses. They have developed sophisticated genomes that store all the information required for their preservation and propagation in a minimal size. This is achieved by overlapping several levels of information coding. Besides the protein-coding sequence, nucleotides also encode structural information, which is translated into an intricate regulatory network governed by an all-RNA-based mechanism. The main achievement of this system is that a small portion of genetic information plays a number of different roles. These structural regions, the so-called functional RNA domains, provide a robust genetic background [13][14][15] that overcomes the variable and complex dynamics of the viral pool. It is noteworthy that the same features that confer viral fitness also provide a potential tool for destroying viruses. Thus, novel nucleic acid-based drugs targeting conserved functional genomic domains are now fundamental components of the antiviral toolbox [16].
This review briefly summarizes the biochemical techniques used to elucidate RNA folding and discusses their application to a practical example, the hepatitis C virus genome. It also highlights the importance of RNA structure as a critical regulatory element for the completion of the viral cycle and provides an updated overview of the nucleic acid-based compounds targeting functional genomic HCV domains.

STUDYING RNA FOLDING
RNA folding is a dynamic process in which complex architectures are acquired by combining structural units as a scaffold. RNA can adopt a number of different conformations, sometimes with roughly equal abundances, thus yielding complex conformational pools. These features complicate the study of the final folding and call for reliable routes for structural analysis.
In the initial steps of RNA structure analysis, comparative phylogenetic sequence studies are recommended to predict potential base pairs [17][18][19][20]. This theoretical study reports on the conservation rate of RNA structural elements and is the first-choice strategy for determining the secondary structure of a novel RNA. The method is based on the high conservation rate of structural motifs in RNA molecules with sequence homology. This approach requires large sets of highly related RNA sequences to yield robust information. Sometimes comparative analyses are not possible, for instance when no homologous sequences are available. In these cases, preliminary studies of the RNA secondary structure can be accomplished by theoretical in silico strategies [21]. The most widely used software for predicting secondary structures from single molecules applies the free energy minimization principle. At equilibrium, a given RNA molecule switches between its "unfolded" state and its folded form. Each conformation has a free energy value (Gibbs energy). The change in Gibbs energy between them quantifies the likelihood of acquiring such folding. Free energy changes can be estimated by applying the nearest-neighbor model, which focuses on independent base pairs and their relationships with adjacent nucleotides. Thus, the change in the total free energy for the structure under study can be calculated by adding up the individual energies of each base pair plus those computed for the individual nucleotides composing the loops. The minimum free energy structure is then found by a dynamic programming algorithm. The main drawback of this theoretical strategy is the assumption that the RNA has a single conformation and that the architecture of every nucleotide is not affected by distant residues. These assumptions may not hold for all RNA sequences. In such cases, free energy minimization is supplemented with partition functions that reflect the probability of each base pair. This calculation reflects the accuracy of the prediction and is therefore recommended for all RNA molecules tested. It is available in the Vienna [22] and RNAstructure [23] packages.
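The dynamic programming idea mentioned above can be illustrated with a deliberately simplified sketch. The code below is not the nearest-neighbor free energy model used by the Vienna or RNAstructure packages; it is the classic base-pair maximization recursion (Nussinov-style), shown only because it has the same recursive structure. The function name, the set of allowed pairs and the minimum hairpin loop size of three nucleotides are assumptions made for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (simplified, energy-free model)."""
    # Watson-Crick and G-U wobble pairs are allowed (illustrative assumption).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimal pair count for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, leaving at least
            # min_loop unpaired nucleotides between them.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCCC"))
```

Real predictors replace the "+1 per pair" score with stacking and loop free energies, but the table-filling order over increasingly long subsequences is the same.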
Theoretical in silico studies must be further confirmed by experimental techniques. Although high-resolution structural methods, such as nuclear magnetic resonance (NMR) and X-ray crystallography, are key pillars of structural biology and must be the ultimate goal in structural research, they present a number of drawbacks. They are laborious and limited in their applicability, since they require short RNA molecules with highly compact folding, and are therefore not useful for determining dynamic architectures in long RNA constructs. Moreover, knowing the exact position of each atom is not always required for understanding RNA function. Therefore, in most cases, other commonly affordable biochemical and biophysical techniques provide valuable insights into the folding mechanism and are invaluable for interpreting high-resolution structure results. In this context, the use of chemical and/or enzymatic reagents that specifically modify or cleave nucleotides in a structure-dependent way is a common approach to elucidate the secondary and tertiary structure of an RNA molecule. The affected residues are scored as reverse transcription stops and resolved by high-resolution gel or capillary electrophoresis [24] (Figure 1A). Reactions are compared to a non-treated control and to a dideoxy-sequencing ladder for the identification of each nucleotide position [25,26].
Many different chemical reagents can be used to examine different facets of both the secondary and tertiary structure of a given RNA molecule. For example, the solvent accessibility of the RNA backbone can be analyzed by hydroxyl radical footprinting. This is a straightforward and reliable method to detect higher-order, closely packed regions within an RNA molecule and gives insights into its global architecture [27]. Further, backbone flexibility can be monitored by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) [28][29][30]. SHAPE has been [...]
Figure 1 (caption, partially recovered). Control reactions are performed in the absence of chemical probe. Adducts are detected as stop signals during the RT reaction, thus yielding a pool of cDNA molecules of variable length that reflects the frequency of each modified position. Adapters required for subsequent Illumina sequencing are ligated to the 3' end of the cDNA. This ssDNA pool is then partially amplified with specific primers that include the corresponding barcodes and adapter sequences. Sequencing reads are aligned and corrected to calculate the relative frequency of each modified nucleotide and generate the reactivity profile. C) Schematic representation of a general SHAPE-MaP assay. In this case, under specific reaction conditions, chemical adducts induce the misincorporation of residues, creating mutations with respect to the original sequence. The resulting cDNAs are then prepared for massively parallel sequencing. Hence, relative reactivity at each nucleotide position can be recorded as absolute mutation frequencies, and then corrected and normalized with respect to a non-treated sample to yield the reactivity pattern. D) Antisense oligonucleotide microarray assays. Under native folding conditions, the input RNA is internally labeled and hybridized with a customized panel of overlapping antisense DNA oligonucleotides. The differential hybridization ability of each oligonucleotide is directly related to the different solvent exposure at the target region. The fluorescent signal is quantified and normalized to render the relative accessibility pattern.
Interestingly, nucleotides involved in tertiary interactions usually adopt stacking geometries or lie at turns in the backbone. These conformations, though relatively "rare", are critical for the establishment of tertiary interactions and are required for the RNA molecule to perform its function. Such regions cannot be fully detected by conventional chemical or SHAPE probes. To overcome this, a novel methodology based on SHAPE chemistry, the so-called differential SHAPE (Dif-SHAPE), has been reported [38]. This strategy makes use of two different SHAPE reagents, N-methylisatoic anhydride (NMIA) and 1-methyl-6-nitroisatoic anhydride (1M6), to map flexible residues with slow conformational dynamics as well as nucleotides involved in π-π stacking interactions. In sum, it is very advantageous for detecting ligand interaction pockets and identifying pre-folded tertiary contacts, thus giving an idea of the overall geometry of a target RNA.
An additional, very useful strategy for identifying the nucleotides involved in the acquisition of secondary and/or tertiary structures is molecular interference. In this methodology, the RNA is chemically modified and then partitioned to separate native, functional molecules from non-functional molecular species [39,40]. This technique is highly selective for characterizing essential nucleotide groups and for establishing hierarchical relationships between residues required for the acquisition of the 3D folding. It is also a versatile tool, since it allows the use of different chemical reagents with different specificities [41].
Whatever the chemical probe used, quantification of the relative reactivity at each nucleotide position is required for inferring the secondary and tertiary RNA structure. Currently, the gold standard for accurately quantifying modified nucleotides involves capillary electrophoresis and further processing of the electropherogram signals. This laborious task can be accomplished with different software packages, such as CAFA [42], FAST [43], SHAPE-CE [44], HiTRACE [45] or the most widely used QuShape package [46]. In particular, QuShape is a user-friendly platform that allows partial automation of data processing for accurate, objective and reliable quantification of probing signals, and is currently considered one of the most useful bioinformatic tools for the analysis of RNA structural mapping.
In recent years, the emergence of novel high-throughput technologies that analyze the sequence diversity and abundance of specific RNA molecules has prompted the development of improved methods for studying and understanding RNA structure at transcriptome scale. The combination of chemical probing with next-generation sequencing provides conformational data at nucleotide resolution in complex mixtures of RNA molecules. The SHAPE-Seq strategy [44,47,48] was the first step toward that goal. It provides complete relative-reactivity spectra at nucleotide resolution for hundreds of RNA molecules present in a complex mixture. Modified residues are detected as stops during the reverse transcription reaction, yielding a pool of cDNAs whose length distribution reflects the frequency of modification throughout the molecule (Figure 1B). One of the main drawbacks of SHAPE-Seq is that it quantifies the frequency of RT stops, and thus it is biased by the efficiency of the RNA-adaptor ligation and specific library preparation for deep sequencing. These limitations can be overcome by the recently developed SHAPE-MaP (SHAPE and mutational profiling) technology (Figure 1C), in which 2'-O-adducts at conformationally flexible residues are misread by the reverse transcriptase under certain simple reaction conditions [49]. SHAPE reactivity is recorded as mutations in the resulting cDNA, and double-stranded DNA libraries can be generated with commercially available kits. Finally, an estimate of the mutation frequency at each residue yields a standard SHAPE reactivity profile that can be used to model the secondary and tertiary structure of a single molecule. This methodology has been further modified and implemented for detecting RNA interaction groups by mutational profiling (RING-MaP) [50]. RING-MaP enables the identification of long-distance interactions and provides a conformational map of different 3D ensembles for a single RNA molecule.
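As a rough illustration of how mutation frequencies might be turned into a reactivity profile, the sketch below subtracts the background mutation rate of a non-treated sample from that of the probe-treated sample at each position, and then rescales the result. The function names, the input layout (per-position mutation counts and read depths) and the normalization by the mean of the top 10% of values are assumptions chosen for illustration; they are not taken from the published SHAPE-MaP pipeline.

```python
def mutation_rate_difference(mut_treated, depth_treated, mut_untreated, depth_untreated):
    """Per-position (treated rate - untreated rate); None where coverage is zero."""
    profile = []
    for mt, dt, mu, du in zip(mut_treated, depth_treated, mut_untreated, depth_untreated):
        if dt == 0 or du == 0:
            profile.append(None)  # position not covered in one of the samples
        else:
            profile.append(mt / dt - mu / du)
    return profile


def normalize_profile(profile, top_fraction=0.1):
    """Scale so that the mean of the most reactive positions equals 1 (assumed scheme)."""
    values = sorted((v for v in profile if v is not None), reverse=True)
    if not values:
        return list(profile)
    n_top = max(1, int(len(values) * top_fraction))
    scale = sum(values[:n_top]) / n_top
    if scale <= 0:
        return list(profile)  # nothing reactive; leave values unscaled
    return [None if v is None else v / scale for v in profile]


# Toy counts for four positions (hypothetical data).
raw = mutation_rate_difference([10, 50, 0, 5], [1000, 1000, 1000, 0],
                               [5, 10, 0, 1], [1000, 1000, 1000, 1000])
print(normalize_profile(raw))
```

Positions without coverage are kept as `None` rather than zero so that missing data are not mistaken for unreactive nucleotides in downstream modeling.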
Structural findings from probing are usually reinforced by mutational analysis and biophysical methods, such as small-angle X-ray scattering or FRET (fluorescence resonance energy transfer). These analyses offer additional clues about the folding and can be used to explore different long-distance interactions or the presence of suboptimal structure pools.
Ultimately, experimental constraints derived from biochemical and biophysical studies can be used to model RNA secondary structure. For example, the RNAstructure package [23] (http://RNA.urmc.rochester.edu/RNAstructure.html) integrates several modules for predicting pseudoknots [51], deriving consensus structures for a set of related sequences [52] or inferring the binding ability of oligonucleotides to different target RNA regions [53,54].
For further prediction of tertiary nucleotide motifs, the MC-Fold and MC-Sym pipeline [55] was designed. This strategy considers the energetic contribution of every residue interaction by defining sets of nucleotide cyclic motifs. The pipeline exhaustively explores the structural space of an RNA molecule and includes modifications to accommodate relative reactivity values derived from different chemical probing methods. This routine enables the prediction of secondary structures with the MC-Fold tool and subsequent three-dimensional modeling with MC-Sym.
Alternatively, the conformational richness shown by RNA molecules also makes them a good target for molecular dynamics simulations. During the 1980s, initial efforts to translate this methodology from proteins to nucleic acids were unsatisfactory and prompted the development of new, refined algorithms. These new algorithms enable the incorporation of experimental constraints derived from probing assays with different chemical reagents, thus yielding accurate structural models [41,50].
Finally, evaluating the ability of RNA target regions to interact with other nucleic acids by sequence complementarity is of great interest for developing specific antisense oligonucleotides or siRNAs directed against long RNA molecules, which are difficult to model in silico. Antisense oligonucleotide microarray technology has been designed to report on the accessibility of consecutive tracts of nucleotides, giving data about the 3D architecture of the target molecule [56][57][58]. It relies on the correlation between the native folding of a target RNA and its ability to interact with different complementary DNA oligonucleotides (Figure 1D). DNA microarrays have been successfully used for the analysis of RNA fragments derived from the human immunodeficiency virus (HIV) [59], the hepatitis C virus (HCV) [26,33,56] or the foot-and-mouth disease virus (FMDV) [60,61].
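The panel of overlapping antisense DNA oligonucleotides used in such microarrays can be sketched as a sliding window over the target RNA, with each window converted to its reverse complement in DNA. The oligonucleotide length, the step between windows and the function names below are hypothetical choices for illustration, not the design parameters used in the cited studies.

```python
# RNA base -> complementary DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(rna_segment):
    """Reverse complement of an RNA segment, written as DNA (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_segment.upper()))


def tile_antisense_oligos(rna, length=20, step=5):
    """Return (start, end, oligo) tuples; start and end are 1-based target positions."""
    oligos = []
    for start in range(0, len(rna) - length + 1, step):
        target = rna[start:start + length]
        oligos.append((start + 1, start + length, antisense_dna(target)))
    return oligos


# Toy target with short windows so the output is easy to inspect.
for entry in tile_antisense_oligos("AUGCAUGCAU", length=4, step=3):
    print(entry)
```

A step smaller than the oligonucleotide length makes consecutive probes overlap, so that every tract of the target is interrogated by more than one probe.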
Together, the combination of biochemical and computational techniques provides complementary, non-overlapping structural data on an RNA molecule. The next sections describe their application to the study of the complex interaction network in the genomic RNA of HCV and its potential as an antiviral target.

FUNCTIONALLY ACTIVE STRUCTURAL RNA DOMAINS IN THE HCV GENOME
Hepatitis C virus (HCV) infection is a major global health problem, with more than 3% of the world population affected (WHO data). Patients usually develop fibrosis, cirrhosis and even hepatocellular carcinoma. Current therapies based on the combination of pegylated interferon (PEG-IFN-α) and ribavirin are effective in only 40% of patients. Two recently approved direct-acting antivirals (DAAs) against the viral protease NS3, Boceprevir and Telaprevir, have improved the sustained virological response of a considerable number of patients when given in cocktails with PEG-IFN-α and ribavirin. However, even under these high-pressure conditions, the virus is able to escape by generating drug-resistant variants [62,63]. Thus, the identification of new targets and the search for fully effective antiviral compounds are major goals in HCV research.
HCV is an enveloped RNA virus belonging to the genus Hepacivirus in the family Flaviviridae, which includes yellow fever virus, bovine viral diarrhea virus and Dengue virus, among others. Its genome is a positive-sense, single-stranded RNA (ssRNA) molecule encoding a single open reading frame (ORF) flanked by highly conserved untranslated regions (UTRs) [64][65][66] (Figure 2). The HCV genome is a multifunctional molecule that not only operates as mRNA but also actively governs and regulates different steps of the viral cycle in cis. This is achieved through conserved functional RNA domains mainly located at both the 5' and the 3' ends of the viral genome.
Early in infection, viral protein synthesis is initiated by direct recruitment of the 40S ribosomal subunit. This binding is directed by a highly structured element that operates as an internal ribosome entry site (IRES). Therefore, the HCV translation initiation mechanism differs greatly from the canonical cap-dependent mechanism used by most cellular mRNAs for protein synthesis [67,68]. The HCV IRES lies in the 5'UTR and also spans a short stretch of the core coding sequence (Figure 2) [69][70][71][72][73][74][75]. This element contains all the information required for efficient translation initiation, encoded in functionally active RNA domains [76]. The use of such a strategy minimizes the requirement for host protein factors [77][78][79][80]. In addition, structural elements at the 3' end of the viral genome and in the core coding region may also help to regulate IRES function and the subsequent translation elongation step [81][82][83][84][85][86][87].
The architecture of the HCV IRES has been extensively studied by different methodologies, including comparative sequence analysis, thermodynamic-based predictions [88], enzymatic and chemical probing techniques, SHAPE, EMSA (electrophoretic mobility shift assays), NMR, X-ray crystallography (for a review, see [89]) and, more recently, atomic force microscopy (AFM) [90]. These analyses have offered a dynamic view of the IRES folding. Under physiological magnesium conditions, the HCV IRES region appears as an extended element with two major domains (II and III), which are organized along a single axis around a complex and tightly compacted double-pseudoknot element (PK1 and PK2; Figure 2) [90][91][92], plus a short stem-loop containing the translation start codon (domain IV) [93]. The essential domain III contains structural elements that are required for efficient ribosomal recruitment (Figure 2). It consists of six hairpins (designated IIIa to IIIf) organized around three- and four-way junctions, which constitute the recruitment platforms for eIF3 (junction IIIabc) [94] and the 40S ribosomal subunit (junction IIIdef) [95], with the core 40S binding center placed in the essential subdomain IIId [78,96,97,98]. This subdomain folds into a G-rich stem-loop with a rigid, asymmetric internal E-loop that resembles the sarcin-ricin loop of the 23S rRNA (Figure 2) [26,43,99,100,101]. The phylogenetically conserved GGG triplet in the apical loop adopts a particular U-turn conformation that provides key properties as a protein and nucleic acid recruitment center [96,100].
The 3'UTR encompasses three autonomously folded elements (Figure 2): i) the highly variable region, HV; ii) a poly U/UC-rich tract of variable length and composition; iii) the essential 3'X-tail region. The 3'X-tail is predicted to fold into two theoretical, mutually exclusive conformations with comparable thermodynamic stability (Figure 2) [113]. Both architectures preserve the 3'SLI, placed at the very 3' end of the viral genome, while the 55-nt-long upstream region appears as a structurally dynamic element. This element switches from the two-stem-loop conformer (3'SLIII and 3'SLII) (Figure 2) to a single stem-loop exposing a palindromic nucleotide sequence, the so-called dimer linkage sequence (DLS), in the apical loop [114] (Figure 2). The DLS initiates HCV genome dimerization [113][114][115] by establishing a canonical apical loop-apical loop kissing interaction involving two genomic RNA molecules. The interaction can then progress toward a more stable extended duplex in the presence of the viral core chaperone protein in vitro [113][114][115].
Translation and replication, as well as dimer formation, can be further regulated by conserved stem-loop structures placed at the 3' end of the coding sequence [26,87,116,117,118,119,120] (Figure 2), which delimit the so-called CRE region (cis-acting replication element; Figure 2). The CRE is composed of three stem-loops: 5BSL3.1, 5BSL3.2 and 5BSL3.3 [116][117][118][119]. The 5BSL3.2 domain (also named SL9266) consists of two GC-rich helices connected by an eight-base bulge and capped by a 12-base apical loop (Figure 2). The 5BSL3.2 domain has been reported to be critical for efficient HCV replication [118,121] and also regulates viral protein synthesis [87]. These actions are mediated by long-distance RNA-RNA interactions with other genomic RNA domains (see below) (Figure 2). In addition, the 5BSL3.2 domain interacts with viral and host protein factors [122,123] and constitutes a core partner in the regulation of the viral cycle. In contrast, the role of the 5BSL3.1 and 5BSL3.3 domains is still unclear [121].

THE RNA-RNA INTERACTION NETWORK TUNES THE FOLDING OF ESSENTIAL FUNCTIONAL RNA DOMAINS IN THE HCV GENOME
The progression of the HCV cycle must be finely regulated to achieve viral adaptive fitness. Transitions between different steps of the viral cycle must occur at precise time points and in the proper molecular environment. To achieve this level of regulation, dynamic regulatory elements are encoded in the compact structure of the viral genome.
The 5BSL3.2 stem-loop is the perfect archetype of a multifunctional partner. It recruits protein factors essential for HCV propagation, such as the viral RNA-dependent RNA polymerase or host proteins [122,123]. Further, it is the core nucleation center of a complex long-distance RNA-RNA interaction network that operates in the HCV RNA genome (Figure 2). By swapping among different contacts with essential structural domains of the viral RNA, 5BSL3.2 promotes conformational rearrangements, not only in the directly involved residues and surrounding areas but also, indirectly, in the remaining partners that compose the network [33]. These changes in the global architecture of the genomic RNA would finally promote the switch between different stages of the viral cycle. To accomplish this, the 5BSL3.2 domain makes use of its apical loop and internal bulge (Figure 2):
i) The apical loop is complementary to the apical loop of the 3'SLII within the 3'X-tail [118,124,125,126], folded in the non-dimerizable form. The resulting kissing-loop interaction is required for efficient viral RNA synthesis in cell culture [118,124]. The structural consequences of this contact are genotype-dependent and include a deep reorganization of the folding of 3'SLII in genotype 2 viral transcripts, which could be related to the acquisition of a dimerizable conformation [33,124]. In contrast, genotype 1 strains would be refractory to these rearrangements, pointing to an additional role of other closely related and highly conserved RNA structural domains in the HCV genome.
ii) The bulge of the 5BSL3.2 domain may swap between two mutually exclusive contacts: one with the apical loop of the IIId subdomain of the IRES region [126,127]; the other with the Alt region placed upstream of the CRE (Figure 2) [119,126]. Both interactions are equally probable and show dissociation constants in the same range [125,126]. Therefore, the choice between them may be determined by the presence of host and/or viral factors and may have important conformational consequences. For example, chemical probing, SHAPE assays and antisense oligonucleotide microarrays revealed that the IIId-5BSL3.2 interaction induces a structural-tuning effect in the directly involved residues and even in the surrounding nucleotides [26,33] in replication-competent RNA transcripts representative of genotype 1 strains (Rep construct; Figure 3). The 3D RNA structure prediction of subdomain IIId for the Rep construct using these experimental constraints renders a model in which residues in the apical loop change their orientation to the solvent with respect to those in a transcript containing the isolated IRES region (construct I; local root-mean-square deviation, RMSD, of 6.84 Å; Figure 3). These reorganization events provide insights into the potential implications for IRES function [26,87]. In addition, the IIId-5BSL3.2 contact controls the structural switch at the 3'X-tail to promote the acquisition of the dimerizable form [33], likely by impeding the 5BSL3.2-3'SLII interaction.
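For readers unfamiliar with the RMSD metric quoted above, the following minimal sketch shows how it is computed between two sets of matched atomic coordinates. It assumes that both structures have already been superimposed and that atoms are listed in the same order; the function name and the toy coordinates are illustrative, and the atom selection used to obtain the 6.84 Å value is not specified here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched (x, y, z) coordinates, in input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two positions of the same atom.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two toy "structures" with two atoms each (coordinates in Å, hypothetical).
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
print(rmsd(model_a, model_b))
```

Restricting the calculation to the residues of one subdomain gives a local RMSD of the kind reported for subdomain IIId.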
Importantly, as a result of the IIId-5BSL3.2 interaction, both ends of the viral genome would be brought into close proximity, rendering a circular topology with important benefits for the virus: it increases the local concentration of essential proteins and cofactors, protects against cellular exonucleases and reduces the spatial distance between different functional domains, enhancing the regulation mediated by RNA domains.
All the structural and functional data described above can be assembled into a biological model to explain the transitions between different steps of the HCV cycle [33] (Figure 4). During viral protein synthesis in the early infection stage, subdomain IIId of the IRES would be mostly occluded by the translational machinery, thus favoring the Alt-5BSL3.2 and 3'SLII-5BSL3.2 contacts. Increasing levels of HCV proteins would favor the binding of NS5B and other factors to the 3'X-tail and the 5BSL3.2 domain, thus generating a molecular context in which the 5BSL3.2-IIId and 5BSL3.2-Alt interactions would be equally feasible. Structural rearrangements and transitions from one to the other may contribute to the establishment of an enhanced replication state [119] by repressing translation [87]. The increase in the amount of viral RNA would shift the interaction equilibrium to promote the acquisition of the dimerizable conformer exposing the DLS motif in the presence of the core chaperone protein [115]. While dimerization is a common process in the genomes of the Retroviridae family, it is not known to occur in the Flaviviridae family within infected cells. In fact, the HCV virion is believed to contain a single copy of the genomic RNA. Therefore, the role of HCV genome dimerization during the infective cycle remains unknown. It seems likely that future work using next-generation high-throughput technologies will help to elucidate this point and to discover novel features of HCV folding.
Figure caption (partially recovered). 3) The newly synthesized RNA molecules harbouring the preferred interaction IIId-5BSL3.2 would expose the dimer linkage sequence (DLS) in an apical loop. In the presence of the core chaperone protein, dimerization is favored in these conformers. 4) Alternatively, HCV core protein can be recruited throughout the entire viral monomeric genome to constitute the nucleocapsid particle, which is then enveloped and released to the extracellular medium. Figure adapted from [25].

TARGETING HCV GENOMIC RNA WITH RNA LIGANDS
Replication in RNA viruses produces a wide spectrum of mutants due to the high error rates of viral RNA-dependent RNA polymerases. This continuous variation hampers the development of efficient antiviral drugs that achieve sustained virological responses. In the case of HCV, therapeutic cocktails containing generic compounds, such as α-interferon and modified nucleotides, provide sustained virological responses only for short periods of time [128]. Therefore, designing novel therapeutic strategies and antiviral drugs is a major goal in HCV research [129].
Broadly speaking, conserved structural and functional genomic domains are excellent candidates for RNA targeting due to their high genetic and structural robustness and conservation. Among the multiple antiviral strategies based on nucleic acid inhibitors, the use of antisense oligonucleotides [16], small interfering RNAs (siRNAs) [130] and aptamers [131,132] has yielded promising results [16]. The development of such therapeutics is challenging, since nucleic acids must be efficiently targeted, stabilized and delivered to the infected cell. In this context, advances in chemical synthesis have addressed some of these drawbacks through the incorporation of modified nucleotides that interfere minimally with the desired inhibitory activity. Chemical modifications in the antiviral nucleic acid prevent degradation by exo- and endonucleases and improve its pharmacokinetic and pharmacodynamic properties, while reducing its immunogenicity [133]. Modifications include chemical substitutions at the ribose 2' position, such as the inclusion of 2'-O-methyl, 2'-fluoro and 2'-O-methoxyethyl groups.
Antisense nucleic acids are oligonucleotides whose sequence is complementary to an existing nucleotide motif in a target RNA. After binding to their target sequence, they can impair key steps in the function of the RNA by competing with protein factors for the interaction site or by disrupting the structure of the target, thus impeding its activity. Different targets can be chosen: promoter sequences, the translation initiation codon or intron-exon junctions, among others. Antisense oligonucleotide inhibitors may also be used to interfere with maturation processes or, if DNA-based, to induce target degradation mediated by RNase H [16].
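The complementarity rule described above can be illustrated with a minimal sketch. It assumes strict Watson-Crick pairing and unmodified bases; the function name `antisense` and the 12-nt motif are hypothetical and are not taken from the HCV genome or from any of the cited studies.

```python
# Bases that pair with each RNA base in an RNA antisense oligonucleotide.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Bases that pair with each RNA base in a DNA antisense oligonucleotide
# (DNA-based oligonucleotides are the ones able to recruit RNase H).
DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense(target_rna: str, dna: bool = True) -> str:
    """Return the antisense oligonucleotide (5'->3') for an RNA motif (5'->3')."""
    table = DNA_COMPLEMENT if dna else RNA_COMPLEMENT
    # Accept targets written with T instead of U.
    target = target_rna.upper().replace("T", "U")
    # The duplex is antiparallel, so the target is read backwards.
    return "".join(table[base] for base in reversed(target))


# Hypothetical target motif, for illustration only.
motif = "GCAUGAGCACGA"
print(antisense(motif))             # TCGTGCTCATGC
print(antisense(motif, dna=False))  # UCGUGCUCAUGC
```

In practice, candidate sequences would also have to be screened for target accessibility and chemical modification patterns, which this sketch does not attempt to model.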
The use of antisense oligonucleotides as anti-HCV agents directed against the RNA genome has mainly focused on targeting the IRES domain. Of all the inhibitors assayed, ISIS 14803 was shown to be the most promising compound. It is a chemically modified DNA antisense oligonucleotide containing several phosphorothioate substitutions [134,135]. It binds to the translation initiation codon, located in the apical loop of domain IV in the IRES region. Preclinical investigations in cell cultures and mouse models showed its ability to block HCV translation and replication [135]. These results encouraged clinical trials of ISIS 14803, which met with little success in phase I [136]. Interestingly, during the course of these investigations, the generation of quasispecies affecting the IRES region was analyzed. Although several variations were observed, none affected the residues complementary to ISIS 14803 or their neighboring sequences in domain IV, suggesting strong conservation constraints for this RNA domain [136].
siRNAs are double-stranded, 21-nt-long RNA molecules that trigger the cellular RNAi pathway. They operate by direct loading into the RISC (RNA-induced silencing complex), where the guide (antisense) strand of the siRNA is specifically selected to target the complementary sequence motif in the viral RNA [130]. The degree of complementarity between the guide strand and the target sequence motif determines the silencing mechanism. Under perfect matching conditions, RISC mediates the degradation of the target RNA. In contrast, partial complementarity triggers translational repression. Thus, any mismatch will affect the efficacy of the siRNA. Targeting highly conserved regions, such as the 5' and the 3' UTRs of RNA genomes, is an excellent strategy to minimize the emergence of mutations in the target domain and to improve the efficiency of the siRNA. Several investigations have demonstrated the viability of this approach, achieving up to 80% HCV inactivation with synthetic siRNAs in subgenomic replicon systems [137][138][139][140][141][142]. These promising results prompted the development of gene therapy strategies to induce a long-term decrease in viral loads. Vector-based short hairpin RNA (shRNA) systems have emerged as an attractive strategy. In these constructs, the shRNA molecule is transcribed from a polymerase III promoter and then processed by Dicer in the cytoplasm to yield the mature, active siRNA molecule [143]. Using this approach, strong inhibition of HCV replication [138] and a reliable decrease in viral titers for genotypes 1a and 2a ex vivo were observed, even at concentrations as low as 2.5 nM [144][145][146]. The main drawbacks of this alternative are the appearance of escape mutants and the induction of the cellular interferon response pathway [147,148]. Human cells transfected with siRNAs show a preferential activation of the Jak/Stat pathway and a general upregulation of the genes whose expression depends on the IFN cascade [148]. Other likely limitations include the overproduction of siRNAs, which may saturate the RNAi components used in the regulation of cell processes.
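The dependence of the silencing mechanism on guide-target complementarity can be sketched as follows. This is a simplified illustration that only mirrors the two cases described in the text (perfect versus partial matching); it treats G-U wobble pairs as mismatches and ignores position-specific effects, both of which are assumptions. The function names and the 21-nt target site are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the fully complementary strand (5'->3') of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(guide: str, target_site: str) -> int:
    """Count non-Watson-Crick positions between a guide and its target site.

    Both sequences are written 5'->3'; the duplex is antiparallel, so the
    guide is compared against the reversed target site.
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    return sum(COMPLEMENT[t] != g for g, t in zip(guide, reversed(target_site)))


def predicted_outcome(guide: str, target_site: str) -> str:
    """Apply the two-case rule: perfect match vs. partial complementarity."""
    if count_mismatches(guide, target_site) == 0:
        return "target cleavage"
    return "translational repression"


# Hypothetical 21-nt target site and a single-nucleotide escape variant.
target = "GCAUGAGCACGAAUCCUAAAC"
guide = reverse_complement(target)
variant = target[:10] + "U" + target[11:]

print(count_mismatches(guide, target), predicted_outcome(guide, target))
print(count_mismatches(guide, variant), predicted_outcome(guide, variant))
```

A check of this kind, run against an alignment of viral isolates, gives a rough indication of how many circulating variants would escape perfect matching with a given guide strand.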
Aptamers are short nucleic acids that efficiently and specifically recognize their target molecule via its 3D architecture. Aptamers are isolated by SELEX (systematic evolution of ligands by exponential enrichment) [149,150]. This process consists of a number of iterative cycles applied to a randomized oligonucleotide pool to finally yield an enriched population of molecules with the desired binding ability. Each SELEX cycle includes pool synthesis, binding, positive selection and amplification steps. Aptamers show great specificity, high affinity, easy large-scale production, pharmaceutical flexibility and low immunogenicity, thus constituting a feasible alternative for the development of novel therapeutic compounds. In 2004, the United States FDA approved the first aptamer drug, Pegaptanib (commercially available as Macugen, OSI Pharmaceuticals/Pfizer), for the treatment of age-related macular degeneration [151]. Other aptamer-based drugs currently being tested in clinical trials include antiangiogenic and anticoagulant agents [152].
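The enrichment achieved by iterating SELEX cycles can be illustrated with a deliberately simple toy model. It assumes that each sequence class is retained in the binding and selection steps with a fixed probability and that amplification restores the pool without bias; the class names, starting abundances and retention values are hypothetical and do not come from any experiment.

```python
def selex_round(pool: dict, retained: dict) -> dict:
    """One toy SELEX cycle: selection by binding, then unbiased amplification.

    pool     -- fraction of the pool occupied by each sequence class
    retained -- probability that a molecule of each class survives selection
    """
    selected = {name: frac * retained[name] for name, frac in pool.items()}
    total = sum(selected.values())
    # Amplification is modeled as renormalization back to a full pool.
    return {name: amount / total for name, amount in selected.items()}


def run_selex(pool: dict, retained: dict, rounds: int) -> list:
    """Return the pool composition before selection and after each round."""
    history = [dict(pool)]
    for _ in range(rounds):
        pool = selex_round(pool, retained)
        history.append(pool)
    return history


# Hypothetical starting pool: binders are rare (0.1% of all molecules).
start = {"binder": 0.001, "background": 0.999}
# Hypothetical retention: binders survive far more often than background.
retention = {"binder": 0.5, "background": 0.01}

for cycle, composition in enumerate(run_selex(start, retention, rounds=3)):
    print(cycle, round(composition["binder"], 4))
```

Under these assumptions the binder fraction rises geometrically from cycle to cycle, which is the behavior the term "exponential enrichment" refers to. Real selections are also shaped by amplification bias and nonspecific retention, which the model omits.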
For HCV, conserved functional domains in the RNA genome have proved to be effective aptamer targets. The preferred genomic functional elements targeted by aptamers include domain II and subdomain IIId in the IRES region [153][154][155][156][157], the 3'X-tail [153] and the 5BSL3.2 domain in the CRE [158]. Interfering with the folding and/or function of these regions inhibits translation and/or replication by up to 90% at aptamer concentrations in the nanomolar range. The efficacy of these molecules has been confirmed in vitro and in subgenomic replicon systems.
The promising results of these investigations served as a starting point for the development of chimeric molecules composed of different inhibitory modules with different specific targets and functional properties [157,159-163]. It is tempting to propose that the use of such compound cocktails would help to minimize the appearance of viral mutants, though no results have been reported to date.

CONCLUSIONS
In past years, the use of conventional methodologies for the analysis of RNA folding actively contributed to the understanding of structure-function relationships in many RNA molecules. Currently, the advent of novel methodologies based on next generation sequencing techniques for the analysis of RNA folding has complemented the previous picture and has moved the field toward a scenario in which RNA is an active and essential cellular regulator. This new view has special relevance in RNA viruses, which use RNA folding to encode information required for adaptive viral fitness. Thus, complex and sophisticated control systems governed by RNA elements have evolved in many viral genomes, and likely in the cell transcriptome. These elements show high genetic robustness and are considered excellent targets for novel antiviral nucleic acid-based drugs. This can be a useful strategy to complement conventional antiviral therapies, such as the use of PEG-IFN. Improvements in DNA and RNA synthesis will likely help to develop innovative compounds that achieve sustained therapeutic responses with minimal toxicity and secondary effects.

ORF, open reading frame
PEG-IFN, polyethylene glycol-interferon, pegylated interferon
PK, pseudoknot
RISC, RNA-induced silencing complex
RMSD, local root-mean-square deviation
SELEX, systematic evolution of ligands by exponential enrichment
shRNAs, short hairpin RNAs
SHAPE, selective 2'-hydroxyl acylation analyzed by primer extension
SHAPE-MaP, selective 2'-hydroxyl acylation analyzed by primer extension and mutational profiling
spp., species
ssRNA, single-stranded RNA
RING-MaP, RNA interaction groups by mutational profiling
siRNAs, small interfering RNAs
tRNA, transfer RNA
UTR, untranslated region

Figure 1. RNA structure mapping. A) RNA folding analysis by chemical probing or SHAPE technology. The RNA is treated with chemical probes that covalently modify nucleotides at specific positions in a structure-dependent manner. Untreated samples must also be included in the assay for background normalization. These modifications, depicted by blue dots, act as stop signals in a reverse transcription (RT) reaction. Fluorescently labeled, color-coded primers are used to map each modified residue. The resulting cDNA products are resolved by automated capillary electrophoresis. Raw data are scaled and normalized to yield the relative reactivity value at each nucleotide. B) Overview of the basic SHAPE-Seq experiment. RNA molecules are treated with SHAPE reagents or any other probe able to covalently modify the RNA in a structure-dependent way. Control reactions are performed in the absence of chemical probe. Adducts are detected as stop signals during the RT reaction, thus yielding a pool of cDNA molecules of variable length, which reflects the frequency of each modified position. Adapters required for subsequent sequencing on Illumina platforms are ligated to the 3' end of the cDNA. This ssDNA pool is then subjected to partial amplification with specific primers that include the corresponding barcodes and adapter sequences. Sequencing reads are aligned and corrected to calculate the relative frequency of each modified nucleotide and generate the reactivity profile. C) Schematic representation of a general SHAPE-MaP assay. In this case, under specific reaction conditions, chemical adducts induce the misincorporation of residues, creating mutations with respect to the original sequence. The resulting cDNAs are then prepared for massively parallel sequencing. Hence, relative reactivity at each nucleotide position can be recorded as absolute mutation frequencies, which are then corrected and normalized with respect to a non-treated sample to yield the reactivity pattern. D) Antisense oligonucleotide microarray assays. Under native folding conditions, the input RNA is internally labeled and hybridized with a customized panel of overlapping antisense DNA oligonucleotides. The differential hybridization ability of each oligonucleotide is directly related to the different solvent exposure of the target region. The fluorescent signal is quantified and normalized to yield the relative accessibility pattern.
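The correction and normalization step outlined in panel C can be sketched as follows. This is a minimal illustration, assuming that per-nucleotide mutation counts and read depths are already available for the treated and the non-treated samples. The normalization used here (division by the mean of the top 10% of background-subtracted values) is a simplified stand-in chosen for the example, not the scheme of any particular published pipeline, and all counts are hypothetical.

```python
def mutation_rates(mutations: list, depth: list) -> list:
    """Per-nucleotide mutation frequency; positions without coverage give 0."""
    return [m / d if d else 0.0 for m, d in zip(mutations, depth)]


def shape_map_reactivity(treated_mut, treated_depth,
                         untreated_mut, untreated_depth,
                         top_fraction=0.1):
    """Background-subtract and normalize mutation rates into reactivities."""
    treated = mutation_rates(treated_mut, treated_depth)
    untreated = mutation_rates(untreated_mut, untreated_depth)
    # Correction: remove the mutation rate seen without chemical probe.
    raw = [t - u for t, u in zip(treated, untreated)]
    # Normalization (assumed scheme): scale by the mean of the top values.
    n_top = max(1, int(len(raw) * top_fraction))
    scale = sum(sorted(raw, reverse=True)[:n_top]) / n_top
    if scale <= 0:
        raise ValueError("no signal above background")
    return [value / scale for value in raw]


# Hypothetical counts for five nucleotides, 1000 reads at each position.
depth = [1000] * 5
profile = shape_map_reactivity(
    treated_mut=[10, 50, 5, 100, 20], treated_depth=depth,
    untreated_mut=[5, 10, 5, 20, 10], untreated_depth=depth,
)
print([round(value, 3) for value in profile])
```

Higher values indicate nucleotides that are more flexible and therefore more likely to be unpaired; such profiles are typically used as restraints for secondary structure prediction.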

Figure 2. Functional domains in the HCV genome. The figure shows the sequence and secondary structure proposed for the 5' and the 3' ends of the HCV genomic RNA, and the long-range RNA-RNA interactions established between distant regions. At the 5' terminus, the minimum region for IRES activity is depicted. The entire 3' end contains the 3'UTR plus the stem-loops 5BSL3.1-5BSL3.3 and the Alt sequence motif in the NS5B coding sequence. The 3'X-tail folds into two alternative conformers with distinct functional roles. The dimer linkage sequence (DLS) is shown in grey. Pseudoknot elements are indicated as PK1 and PK2. The translation start and stop codons are shown in bold and marked by arrows. Nucleotide numbering corresponds to the HCV Con1 isolate (GenBank accession number AJ238799).

Figure 3. RNA structural modeling of subdomain IIId. PDB RNA structure prediction of subdomain IIId using the MC-Fold/MC-Sym pipeline. Relative reactivity values obtained for different chemical reagents were used to generate the 3D models of subdomain IIId in transcript I, which contains only the IRES region, and in the replicative RNA Rep, both depicted at the top of the figure. The root-mean-square deviation (RMSD) was calculated by comparing both models in order to infer differences in the stem-loop conformation. Color code: black, residues with an RMSD <3.5 Å with respect to molecule I; orange, nucleotides with an RMSD ranging from 3.5 to 6.0 Å with respect to I; red, residues with an RMSD >6.0 Å.

Figure 4. Proposed model for the role of long-range RNA-RNA interactions in the HCV infective cycle. 1) In early infection, the naked genomic RNA initiates viral translation by directly recruiting the 40S ribosomal subunit at subdomain IIId of the IRES region. This impedes the IIId-5BSL3.2 interaction and enhances the conformational rearrangement of the 3' end mediated by the 5BSL3.2 domain. 2) Viral protein accumulation triggers the release of the ribosome from the IRES, while viral and cellular protein factors bind to the 3'SLII. This favors a translationally repressed state through the establishment of the IIId-5BSL3.2 contact, and a favored replication process dependent on the Alt-5BSL3.2 interaction. 3) The newly synthesized RNA molecules harbouring the preferred IIId-5BSL3.2 interaction would expose the dimer linkage sequence (DLS) in an apical loop. In the presence of the core chaperone protein, dimerization is favored in these conformers. 4) Alternatively, HCV core protein can be recruited along the entire monomeric viral genome to constitute the nucleocapsid particle, which is then enveloped and released to the extracellular medium. Figure adapted from [25].