The Math of HIV

A recent conference in Utrecht was a case study on how much computational biology and mathematical modeling can reveal about HIV and its transmission

By Andreas von Bubnoff

In 1995, a team of researchers including David Ho of the Aaron Diamond AIDS Research Center in New York and Alan Perelson of the Los Alamos National Laboratory published a study that detailed how viral load measurements of a patient treated with a protease inhibitor were used to model viral replication under treatment. Their mathematical model showed that infected cells die after no more than two days, and that most viruses in an infected person therefore most likely come from recently infected cells. This, in turn, suggested that HIV replication could be effectively stopped with a combination of antiretroviral drugs that target different parts of the HIV life cycle (Nature 373, 123, 1995).

The prediction turned out to be correct: Combination antiretroviral therapy (cART) revolutionized the treatment of AIDS, bringing some people back from the very brink of death. The study itself “proved to the field that if you collaborate with mathematicians, you can get new information out of your data,” said Rob de Boer of the University of Utrecht and an organizer of the 20th International HIV Dynamics & Evolution conference, held May 8-11 in Utrecht, in the Netherlands. Indeed, he added, it’s precisely this kind of work that has created a community of researchers who regularly attend the annual conference.
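
The reasoning behind the 1995 result can be illustrated with a minimal sketch. It assumes that viral load falls as a single exponential once treatment starts, which is a simplification and not the published model, and it uses synthetic numbers rather than patient data. The function names, the starting load and the decay rate of 0.35 per day are all illustrative assumptions.

```python
import math

def fit_decay_rate(days, viral_loads):
    """Return the decay rate (per day) from a least-squares line
    fitted to ln(viral load) versus time."""
    logs = [math.log(v) for v in viral_loads]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return -sxy / sxx  # the slope is negative while the load is falling

def half_life(decay_rate):
    """Time for the load to halve under pure exponential decay."""
    return math.log(2) / decay_rate

# Synthetic example (assumed values, not data from the study).
ASSUMED_RATE = 0.35  # per day, hypothetical
days = [0, 2, 4, 6, 8]
loads = [1e5 * math.exp(-ASSUMED_RATE * t) for t in days]

rate = fit_decay_rate(days, loads)
print(f"estimated decay rate: {rate:.3f} per day")
print(f"implied half-life: {half_life(rate):.2f} days")
```

With these assumed numbers, the fitted rate corresponds to a half-life of roughly two days. That is the kind of short turnover time that can be read off viral load curves once a model is fitted to them.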

With its multidisciplinary assemblage of about 140 scientists—including immunologists, mathematical modelers, virologists, and computational biologists—the Utrecht conference offered up a bracing mix of theoretical and empirical discovery in HIV research. The conference also illustrated how mathematical modeling and computational biology can illuminate everything from the evolution of HIV to the networks through which it is transmitted.

Sequences tell tales

Many talks at the conference focused on the analysis of HIV sequences, including the use of those analyses to expose the networks through which HIV is transmitted between individuals. This is possible because many doctors now routinely sequence the pol gene of HIV in their infected patients. The gene encodes HIV protease and reverse transcriptase, which are two major targets of ART. Such sequencing alerts physicians to the emergence of resistance mutations and allows them to adjust ART regimens accordingly.

These sequences are particularly easy to collect in the UK, since it is one of two countries (the other is Switzerland) that maintain a national database of the sequences. Currently, the UK database contains about 70,000 partial pol sequences, according to Manon Ragonnet-Cronin, who works in Andrew Leigh Brown’s group at the University of Edinburgh.

Ragonnet-Cronin presented an analysis of about 2,000 clade A and 10,000 clade C pol sequences from the UK database, which she conducted to trace the transmission of these viral subtypes. She did this by determining how closely related the sequences are on an evolutionary tree and how long ago potential transmission events occurred.
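
The general idea of grouping related sequences can be sketched in a few lines. This is a simplified stand-in, not the tree-based analysis described at the conference: it links any two aligned sequences whose fraction of differing sites falls under a cutoff and then reports the connected groups. The toy sequences, the names A to D and the 0.06 cutoff are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def find_clusters(seqs, threshold):
    """Link sequences whose pairwise distance is at or below the
    threshold and return the connected groups (union-find)."""
    parent = {name: name for name in seqs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())

# Toy aligned fragments (invented for illustration only).
toy_seqs = {
    "A": "ACGTACGTACGTACGTACGT",
    "B": "ACGTACGTACGTACGTACGA",
    "C": "ACGTACGTACGTACGTACTA",
    "D": "TTTTACGTACGTTTTTACGT",
}
clusters = find_clusters(toy_seqs, threshold=0.06)
print(clusters)
```

In this toy set, A and C end up in the same group even though they differ at two sites, because each is within the cutoff of B. Real analyses also estimate when transmission events occurred, which a simple distance cutoff cannot do.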

Ragonnet-Cronin said that clade A and C infections, once primarily contracted abroad by UK residents, are now predominantly transmitted by heterosexual individuals within the UK and have lately become more common. Her analysis suggests that new infections by these subtypes do not primarily come from highly connected individuals (who transmit HIV to many others), but occur randomly, from well connected as well as less well connected individuals. This would suggest that generalized prevention efforts—which don’t target specific groups—are likely to be sufficiently effective in the subtype A- and C-infected heterosexual population.

This is different from a previous analysis of subtype B transmissions, which still dominate in the UK epidemic and occur primarily between men who have sex with men (MSM). In that analysis, Leigh Brown and colleagues showed that MSMs in the UK tend to initially get infected by highly connected individuals. The implication is that, for MSMs, treatment and prevention efforts would be more effective if they were focused on such highly connected people.

Next, Leigh Brown wants to use pol sequence analysis to characterize HIV transmission networks in sub-Saharan Africa, to determine which groups of people HIV treatment and prevention efforts should focus on in that part of the world.

US researchers are also using HIV pol sequences to analyze HIV transmission. Several US states maintain their own databases. One such state is Michigan, where doctors are required to report sequences to the state, according to Erik Volz from the University of Michigan. Volz presented an analysis of 1,217 partial HIV pol sequences collected by the Michigan State Department of Community Health from clade B-infected MSMs.

Volz used sequence similarity and additional data such as incidence, prevalence and stage of infection to show that the epidemic in Michigan is primarily driven by young MSMs who infect other young MSMs. This, he said, belies previous assumptions that the state’s epidemic is mostly driven by older MSMs who get infected through transactional sex with younger MSMs.

Volz said this kind of analysis should make it easier to focus treatment and prevention efforts on the right risk groups. Due to a shortage of personnel, he said, it currently takes weeks to interview a newly diagnosed individual and then notify that person’s partners. “If you could take [the] personnel and focus them just on the people who are more likely to have transmitted in the recent past, you are more likely to find people with acute infection.”

Sanjay Mehta from the University of California in San Diego reported results from a similar study of HIV transmission in the San Diego area. Mehta and his colleagues analyzed over 1,000 mostly partial HIV pol sequences collected between 1996 and 2012, mostly from MSMs within weeks to a few months after infection. ZIP codes revealing where the infected individuals resided were available for 565 of those cases. Using this information, Mehta mapped the epicenter of the HIV epidemic in San Diego to the Hillcrest neighborhood and found, to his surprise, a net influx of infections into Hillcrest: More Hillcrest residents had acquired the virus from people outside than vice versa.

One explanation, Mehta said, is that more aggressive HIV testing in Hillcrest leads to earlier detection and treatment of HIV infection, which makes infected individuals who live in Hillcrest less likely to transmit HIV to others. If true, this would show that aggressive testing has positive effects and can prevent future infections, Mehta said.

Researchers are also using HIV sequences to study its evolution. Samuel Alizon, a researcher from Montpellier, in France, compared many different sequences to show that HIV evolves more quickly within the same host than between different hosts. Until now, he said, this has only been shown for portions of the env gene. But Alizon has found that it is actually the case for the entire HIV genome.

This means that HIV variants that are transmitted to another person aren’t the most highly evolved viruses in that person, and that HIV strains that are less adapted to the host have a transmission advantage. In other words, many HIV variants have adapted so much to their host that they lose some of their ability to infect others. The variants that are eventually transmitted, Alizon said, likely come from latently infected CD4+ T cells, where they were probably “stored” from an earlier stage of infection.

Probing the genome

Genome-wide association studies (GWAS) look for links between genetic variations in the host and differences in the way infected individuals respond to the virus. Mary Carrington from the Frederick National Laboratory for Cancer Research in Frederick, Maryland, uses such analyses to better understand so-called elite controllers—people who naturally keep viral load at undetectable levels for years without treatment. One genetic factor associated with elite control is an allele named B57, which encodes a variant of a human leukocyte antigen (HLA) class I protein that infected cells use to present HIV peptides to CD8+ T cells. It appears that CD8+ T cells kill infected cells more efficiently when they are engaged by the B57 variant.

However, not all infected people who have the B57 allele become elite controllers, suggesting that other genetic factors must be involved in control. To identify them, Carrington and researchers in the lab of David Goldstein at Duke University compared the entire genome sequence of 97 elite controllers who have the B57 allele with the sequence of 90 infected individuals who also have the allele but don’t control viral load.

They found that a gene called KIR3DL1 differed between the two groups. The gene encodes a receptor on the surface of natural killer (NK) cells that recognizes HLA class I molecules on infected cells and then keeps the NK cells from killing the infected cells. The finding suggests that a certain variant of KIR3DL1 synergizes with B57 to control viral load in elite controllers. Carrington said the finding “really solidifies” a previous finding by her group that higher expression of KIR3DL1 synergizes with B57 in the control of viral load (Nat. Genet. 39, 733, 2007).

The mechanism of that control, however, remains unclear. One possibility is that high KIR3DL1 expression on NK cells results in better maintenance of the immune functions of HIV-specific CD4+ helper cells or CD8+ cells. That’s because when KIR3DL1 recognizes HLA class I molecules such as B57 on the surface of HIV-specific CD4+ helper cells or CD8+ effector cells, it may keep the NK cells from killing these CD4+ (or CD8+) cells. This would theoretically result in better control of viral load by maintaining the CD8+ related cellular and/or CD4+ related humoral immune response functions.

While this study involved sequencing of the complete genomes of just 187 people, Carrington is also participating in a much larger, international effort to pool genome-wide data of genetic variations from 6,538 HIV-infected people to be used for GWAS analyses. The project, presented by Paul de Bakker from the University Medical Center in Utrecht, does not use complete genome sequences, but “gene chip” data of the most common genetic variations in their genomes.

“We will be able to perform the largest study [of HIV-infected people] ever performed genome-wide,” de Bakker said, adding that this will make it possible to identify genetic variants associated with disease progression or viral load control at a larger scale than ever before. It’s remarkable, he said, that the HIV research community was able to assemble all their data into one big analysis. “We cannot take that for granted,” he said.

Modeling the Envelope

Like many other proteins, the HIV Envelope (Env), which forms the spikes on the surface of the virus, is extensively modified by complex sugar chains. But determining the structure of Env with the attached sugars is difficult to do by X-ray crystallography, which involves growing a protein crystal and then reading how that crystal scatters X-rays.

That’s why most available structures of Envelope represent the sugarless proteins. But Natasha Wood, who works in the group of Simon Travers at the South African National Bioinformatics Institute, presented what she said was to her knowledge the first molecular dynamics modeling analysis of a part of HIV Env that included the sugar groups. For her calculations, she used software to attach sugars to known positions of gp120. Using the known crystal structure of gp120, Wood modeled the interactions of all of gp120’s 7,500 atoms and roughly 320,000 water molecules surrounding it. It took 64 CPUs 20-24 days to calculate just 30 nanoseconds of the resulting changes in the protein structure.
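
The cost of such a simulation is easier to appreciate as a back-of-the-envelope figure. The sketch below uses only the numbers quoted above and assumes, for illustration, that all 64 CPUs ran continuously for the whole period; the function name is hypothetical.

```python
def cpu_hours(cpus, days):
    """Total CPU-hours if every CPU runs around the clock."""
    return cpus * days * 24

SIMULATED_NS = 30  # nanoseconds of simulated protein motion

low = cpu_hours(64, 20)   # shortest reported run time
high = cpu_hours(64, 24)  # longest reported run time

print(f"total: {low} to {high} CPU-hours")
print(f"per simulated nanosecond: {low / SIMULATED_NS:.0f} "
      f"to {high / SIMULATED_NS:.0f} CPU-hours")
```

Under that assumption, each simulated nanosecond costs on the order of a thousand CPU-hours, which helps explain why the analysis was limited to part of Env.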

The analysis focused on the V3 loop, which is thought to determine whether HIV uses the CXCR4 or the CCR5 coreceptor to enter CD4+ T cells, its main target. It showed that with sugar groups attached, the V3 loop of Env is narrower and bent, compared with the V3 of a version of Env that lacks sugar groups. Wood also compared models of two proteins, one from the sequence of a virus shown in previous analyses to preferentially use the CXCR4 coreceptor to infect target cells, the other from a virus known to use CCR5. This showed that the CXCR4 version, which in this case had one more sugar than the CCR5 version, was a little narrower, in keeping with the overall trend that more sugars lead to a narrower shape.

This suggests that the narrowness of the V3 loop may play an important role in coreceptor tropism, Wood said. But to know for sure, additional proteins known to prefer CXCR4 or CCR5 would have to be modeled to substantiate the results. Wood also plans to model larger portions of Env, and wants to study how sugar groups affect the interaction between Env and neutralizing antibodies.

Calling all post-treatment controllers

The conference wasn’t only about modeling and math. In his keynote address, Asier Sáez-Cirión of the Institut Pasteur in Paris shared an update of his research on the VISCONTI cohort. This is a group of 14 HIV-infected individuals who started antiretroviral treatment early and, after stopping treatment, turned out to be able to control HIV infection (see Is it Ever Too Early?, IAVI Report, Sep.-Oct. 2012). Sáez-Cirión said that since he published his results in March, he has been contacted by researchers from all over the world and has learned of 20 additional cases of post-treatment controllers. He said that all such cases worldwide will be collected in an international cohort, the formation of which will be announced at the upcoming International AIDS Society conference in Kuala Lumpur.

Sáez-Cirión argued that his and other studies now strongly indicate that HIV-infected people should start treatment as early as possible. That’s why he was happy, he said, that the U.S. Preventive Services Task Force recently recommended HIV testing of all people between the ages of 15 and 65, regardless of whether they are at a high risk of HIV infection.

In the discussion after the talk, John Coffin of Tufts University in Boston urged Sáez-Cirión to sequence the HIV variants in the VISCONTI individuals, to determine if there is any ongoing evolution (and therefore low-level replication). That would clarify, Coffin said, if these individuals are more similar to naturally occurring elite controllers (who show ongoing viral evolution) or to people on cART, in whom the virus stops evolving.

New hideouts for HIV

The experimental talks also covered research on the HIV reservoir, thought to reside largely in latently infected, resting memory CD4+ T cells, which divide infrequently. But Ben Berkhout of the University of Amsterdam suggested that multiplying, “activated” CD4+ T cells can also be latently infected, and that contact with dendritic cells (DCs) can rouse the virus in those cells (PLoS Pathog. 9, e1003259, 2013).

This implies that in most cases, when the virus infects activated CD4+ T cells, it immediately integrates and becomes latent, so the cells don’t produce the virus. Such latently infected activated CD4+ T cells would thus make up at least part of the reservoir, something that hasn’t previously been appreciated, Berkhout said. This, he noted, could also be the way the latent reservoir in resting memory CD4+ T cells is established, since activated CD4+ T cells can turn into resting cells.

To show this, Berkhout and colleagues infected activated CD4+ T cells in vitro and blocked new virus infections after four hours. They found that when they added DCs, two- to fourfold more CD4+ T cells started to produce virus than in the absence of DCs. This suggests that most of the activated CD4+ T cells had become latently infected with HIV, and that contact with DCs can rouse the latent virus in such cells.

DCs therefore might secrete a soluble factor that can purge the latent virus from activated cells, said Berkhout. “We would like to identify the soluble component secreted by dendritic cells that’s doing this,” he said. “If you know that component, you may [be able to] produce it as a recombinant protein and use it as a natural purging agent.” It could, he said, be used together with other drugs that are currently being tested to purge the reservoir, such as histone deacetylase (HDAC) inhibitors, which induce HIV replication in latently infected resting CD4+ T cells. Because the hypothesized factor would be produced by the body, it would probably have fewer side effects than agents such as HDAC inhibitors.

Coffin also reported evidence for an additional source of the HIV reservoir. He presented a case study of an HIV-infected person who became resistant to his cART regimen after 11 years. The case is unusual because sequencing of his HIV RNA revealed not only drug-resistant HIV, but also a “clonal” population with many identical copies of a wild-type, drug-sensitive HIV variant. While switching the patient’s treatment to a new cART regimen reduced the level of the resistant variants, it didn’t affect the wild-type variant much, if at all.

Because the patient had oral cancer at the time, it’s possible that the wild-type HIV variants came from infected immune cells that were multiplying as part of an immune response to the cancer, Coffin speculated. This would explain the many identical copies of the wild-type virus and why cART didn’t affect it, he said. Because clonal populations of wild-type virus are quite common in patients after about five years of treatment, this could mean that multiplying HIV-infected cells are an additional, underappreciated source of persistent viremia in people treated with cART, Coffin said.

Vaccine updates

While vaccines weren’t the main focus of the meeting, there was some discussion of the subject. Hanneke Schuitemaker of the company Crucell gave an overview of current vaccine development efforts at the firm. Crucell has been manufacturing Adenovirus serotype 26 (Ad26) vectors for use in human trials.

As might be expected, one topic that was discussed was the recent termination of the HVTN 505 trial, which was discontinued after it became clear that the DNA/Ad5 prime-boost vaccine regimen it was evaluating did not have any protective effect. The data also revealed a statistically non-significant increase in HIV acquisition among the vaccine recipients compared with placebo recipients. This has raised questions about the future of adenovirus-based vaccination strategies (see IAVI Report blog, HVTN 505: “A hard blow,” April 26, 2013).

Schuitemaker argued that Ad5 should not get all the blame, because the statistically non-significant increase in infections among vaccine recipients appears to have been apparent already after the DNA priming. This, she said, suggests that the DNA priming could be at least part of the reason for the increase.

Many current HIV vaccine candidates elicit immune responses that the virus easily evades, since the candidates often target HIV proteins—or protein parts—that are not essential to viral survival. As a result, it’s easy for the virus to develop escape mutations that don’t affect its function, said James Mullins of the University of Washington, who described a vaccine approach that seeks to focus immune responses to essential parts of the virus, so that any escape mutations would harm the virus. This way, Mullins said, the immune system is not distracted into mounting immune responses to inessential parts of the virus.

As immunogens for their vaccine, Mullins and colleagues chose conserved parts of the HIV Gag protein, which mostly form hexamers to create the capsid, a shell that encloses the HIV RNA genome. They found that the parts of Gag that are most important for viral function are the interface portions that connect different hexamers. These are also relatively conserved portions of HIV, and immune responses to parts of the Gag protein have been reported in previous studies to be associated with control of viral load.

Mullins and colleagues made a DNA vaccine that contained seven of these conserved Gag regions. They found that priming with this vaccine followed by a boost with a DNA vaccine containing full length Gag elicited very good CD4+ and CD8+ immune responses in rhesus macaques, as well as vigorous antibody responses. The responses, Mullins said, are initially focused on the conserved Gag elements in the prime, and remain focused on these elements after the boost, which further elevates the responses. This is not the case if the vaccinations are done the other way around, with the full length Gag DNA as the prime, followed by the conserved element DNA as the boost, suggesting, Mullins said, that the prime is the part of the vaccination that’s most important to focusing the immune responses.

Because Gag is an internal HIV protein, this Gag-based vaccine approach is unlikely to induce antibodies that prevent infection. But they could reduce viral load, which Mullins wants to check by challenging his vaccinated macaques with SIV. If the vaccine can successfully reduce viral load and focus the immune responses to the conserved elements, then the immunogen “should be the [candidate] Gag component of any vaccine,” Mullins said.

It would also be a good idea, Mullins said, to try to make a vaccine that contains a version of Envelope that focuses the immune response only on the essential parts of Env. However, he noted, this would be more difficult to develop, because many conserved parts of Env are not essential to its function, making it more difficult to identify the essential parts of Env for use as immunogens.

Perhaps math modelers will present an answer to that problem as well at some future HIV Dynamics & Evolution conference.