In vivo gene editing of human hematopoietic stem and progenitor cells using envelope-engineered virus-like particles

Cloning of constructs

Amino acid sequences for VSV-G, BaEVTR (ref. 31) and the BaEVTR_MACA variant were codon-optimized, synthesized (ATUM or Twist) and cloned into pcDNA3.1 (VSV-G) or pTwist-CMV-β-globin (BaEVTR) plasmid backbones. Nipah G and F amino acid sequences were similarly codon-optimized and cloned into the pCAGGS backbone. Blinded Nipah G protein (four point mutations in the head) was engineered with binders through a flexible 3×G4S linker at the C terminus. G and F proteins also incorporated cytoplasmic tail modifications to enable pseudotyping (ref. 35). Single-domain antibodies targeting huCD133 were obtained from the literature. Anti-CD133 scFv sequences were obtained from the literature and cloned into the blinded Nipah G protein C terminus (ATUM). The RetroNectin CoD sequence was determined by screening a library of constructs with combinatorial optimizations of the RetroNectin CS-1 and C-domains, transmembrane anchor domains and linkers between the different segments. These amino acid sequences were codon-optimized and cloned into the pTwist-CMV-β-globin plasmid backbone. The lead sequence was selected on the basis of the highest titers on resting CD34+ cells. When Nipah G and F proteins were used in the context of retargeting to the huCD133 receptor, additional G cytoplasmic tail and linker modifications and F modifications were screened for improved on-target titers with this composition. Briefly, for the Nipah G modifications, cytoplasmic tails from related paramyxoviruses or from unrelated viruses that were identified to pseudotype retroviral vectors with high titers were synthesized and cloned into the G protein along with diversification of the linker between the G head and the binder (ATUM). Linker length and flexibility were varied according to sequences in the literature. The Nipah G variant library was screened for improvements in on-target versus off-target cell line titers.
The top 20 variants were then screened in combination with the top Nipah F variants that were separately screened in the same manner. Lead G and F combinations were then made at larger scale and concentrated, followed by testing on primary CD34+ cells in vitro.

Cell cultures

HEK-293T Lenti-X cells were obtained from Takara (632180) and maintained in DMEM (Thermo Fisher Scientific, 10569-044) with 10% FBS (Thermo Fisher Scientific, 10437028). Suspension HEK-293T clones were generated and screened for highest titer production of BaEVTR LV vectors or MLV Gag–Cas9 VLP vectors. Briefly, the parental HEK-293T clone was obtained from the American Type Culture Collection (ATCC; CRL-3216). After adaptation to suspension culture and growth in serum-free medium, the top clones were tested for the production of LV vectors. The top ten clones with the highest p24 concentration in pg ml−1 and functional titer were then further screened for the highest functional titer of BaEVTR LV vectors or BaEVTR MLV VLP vectors. The top clone (SCC4180) was then used for production and maintained in LV-MAX medium (Thermo Fisher Scientific, A4124004). HEK-293T Lenti-X cells were engineered to overexpress huCD133 by standard LV transduction. Briefly, transfer plasmids were constructed to express full-length codon-optimized huCD133 (NP_006008.1) coding sequences, followed by an IRES–puromycin cassette. VSV-G-pseudotyped LV vectors were produced with transfer plasmid and HEK-293T Lenti-X cells were transduced at multiple vector doses. Then, 3 days after transduction, cells were treated with 1 μg ml−1 puromycin and assessed for huCD133 expression by FC after ten more days. Conditions with the appropriate expression levels were expanded and clonally sorted into 96-well plates. Clones were grown for 2–3 weeks and checked for expression stability by FC. Clones that maintained a high and tight expression level (HEK CD133 overexpression), with minimal silencing, were chosen for future experiments.

LV and VLP production

For LV vectors produced from adherent 293T Lenti-X cells, fusogen plasmids were combined with psPAX2 and a transgene vector containing enhanced GFP under the control of an SFFV promoter in the pSF backbone at a 0.2:0.3:0.5 ratio of fusogen, packaging and transgene, respectively. Plasmids were complexed with TransIT-293 transfection reagent (Mirus Bio, MIR 2706) and added to producer cells that were seeded the day prior, reaching 60–80% confluency at the time of transfection. The next day, a full medium exchange was performed. On the second day after transfection, the cell supernatant was collected and centrifuged at 1,000g for 10 min to pellet any producer cells. After passing through a 0.45-µm filter, crude material was aliquoted and frozen at −80 °C. For 100× or 300× concentrations, vectors were concentrated by 90-min ultracentrifugation over a sucrose cushion, followed by resuspension in formulation buffer. Concentrated LV vectors were aliquoted and frozen at −80 °C for later use. For LV vectors produced from suspension 293T clones, fusogen plasmids were combined with pALD-GagPol and pALD-Rev packaging plasmids (Aldevron), along with the pSF-SFFV-GFP transgene plasmid, at a 0.16:0.21:0.11:0.52 ratio of fusogen, Rev, GagPol and transgene, respectively. Plasmids were complexed with PEIpro transfection reagent (Polyplus/Sartorius) and added to producer cells that were seeded at 2.5 × 10^6 cells per ml 1–2 h before transfection. For Nipah fusogens, the G and F plasmid ratio was optimized for the highest titer depending on the targeting binder. The next day, 5 mM sodium butyrate was added to the culture. On the second day after transfection, cell supernatant was harvested as described above and crude or concentrated LV vectors were obtained.
For VLPs produced from suspension 293T clones, fusogen plasmids were combined with MLV-based or HIV-based GagPol and Gag–Cas9 plasmids, along with the gRNA plasmid at a 0.475:0.36:0.115:0.05 ratio of gRNA, MLV GagPol, MLV Gag–Cas9 and BaEVTR, respectively, or 0.364:0.238:0.238:0.128:0.032 ratio of gRNA, HIV GagPol, HIV Gag–Cas9, Nipah F and Nipah G, respectively. The Gag–Cas9 fusion strategy was adapted from a previously described protocol (refs. 22,24). For 100× or 300× concentrations of LV or VLP vectors, freshly harvested crude supernatant was concentrated by 90-min ultracentrifugation or overnight (20–24 h) centrifugation at 4,100g over a sucrose cushion, followed by resuspension in formulation buffer. Concentrated LV vectors were aliquoted and frozen at −80 °C for later use. All absolute particle calculations reported in the manuscript for LVs and VLPs were performed on the basis of the results of ELISA assays for p24 (QuickTiter lentivirus titer kit, lentivirus-associated HIV p24) and p30 (QuickTiter MuLV core antigen ELISA kit), respectively.
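The plasmid ratios above can be turned into per-plasmid amounts for a transfection mix. The following is a minimal sketch, assuming the ratios are fractions of total plasmid DNA by mass (the text does not say whether they are mass or molar ratios). The function name `split_plasmid_dna` and the total DNA amount used in the usage note are hypothetical.

```python
# Plasmid fractions as stated in the text; each set sums to 1.
MLV_VLP_RATIOS = {
    "gRNA": 0.475,
    "MLV GagPol": 0.36,
    "MLV Gag-Cas9": 0.115,
    "BaEVTR": 0.05,
}
HIV_VLP_RATIOS = {
    "gRNA": 0.364,
    "HIV GagPol": 0.238,
    "HIV Gag-Cas9": 0.238,
    "Nipah F": 0.128,
    "Nipah G": 0.032,
}


def split_plasmid_dna(total_dna_ug, ratios, tolerance=1e-6):
    """Split a total DNA amount (ug) across plasmids by fractional ratio.

    Assumption: the ratios are fractions of total DNA mass.
    """
    total_fraction = sum(ratios.values())
    if abs(total_fraction - 1.0) > tolerance:
        raise ValueError(f"fractions sum to {total_fraction}, expected 1")
    return {name: total_dna_ug * fraction for name, fraction in ratios.items()}
```

For example, `split_plasmid_dna(100.0, MLV_VLP_RATIOS)` assigns about 5 µg of a hypothetical 100-µg mix to the BaEVTR plasmid. The sum check guards against transcription errors in a ratio set.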

Titer determination on cell lines and MOI calculation

To determine functional titer (TU per ml for LVs or EU per ml for VLPs), cell lines were seeded in 96-well tissue culture-treated plates (30,000 cells per well for all HEK-293T lines) and then crude or concentrated vectors were serial diluted onto the cells 2–4 h after cell seeding. In the case of NiV-CD133 constructs, titers were determined on the HEK CD133 overexpression cell line (generated as described above) because of the low permissiveness for transduction by CD133-targeted vectors of the parental HEK-293T cell line, which expresses low endogenous levels of huCD133. For LVs, cells were analyzed for GFP 3–4 days after vector addition. For VLPs, cells were analyzed for B2M knockout 6–7 days after vector addition. GFP and B2M were both analyzed by FC. Cells were harvested from the 96-well plates, washed and incubated with Draq7 viability dye to determine live cells before assessing GFP expression. For B2M readout, cells were blocked in cell stain buffer (Biolegend, 420201) and stained with anti-B2M antibody (1:100; Biolegend, 316306). After washing, cells were incubated with Draq7 and analyzed by FC. Functional titers were calculated from the vector dilution at which the percentage GFP+ or B2M values were in the 5–20% range. When LVs or VLPs were dosed on the basis of MOI, specific calculations were used. The cell line titer that was determined above was used to calculate the total number of TUs in the given volumetric dose of vector to the cells in vitro. The total number of TUs applied was divided by the number of cells in the well at the time of vector delivery to determine the MOI.
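The titer and MOI arithmetic described above can be sketched as follows. The MOI definition (total TUs applied divided by cells in the well) comes from the text. The titer formula is the conventional one (transduced cells × dilution factor / volume added) and is an assumption here, because the text does not state it explicitly. Function and parameter names are hypothetical.

```python
def functional_titer(cells_at_transduction, percent_positive,
                     volume_added_ml, dilution_factor):
    """Estimate functional titer (TU or EU per ml of undiluted vector).

    Assumed conventional formula, not stated in the text:
        titer = cells * (percent_positive / 100) * dilution_factor / volume
    Only dilutions giving 5-20% positive cells are accepted, matching the
    range described in the text.
    """
    if not 5.0 <= percent_positive <= 20.0:
        raise ValueError("percent_positive is outside the 5-20% range")
    transduced_cells = cells_at_transduction * percent_positive / 100.0
    return transduced_cells * dilution_factor / volume_added_ml


def moi(titer_per_ml, dose_volume_ml, cells_in_well):
    """MOI = total units applied / cells in the well at vector delivery."""
    total_units = titer_per_ml * dose_volume_ml
    return total_units / cells_in_well
```

With hypothetical inputs of 30,000 cells, 10% GFP+ cells and 0.05 ml of a 1:100 dilution, the estimated titer is about 6 × 10^6 TU per ml. Dosing 0.01 ml of that stock onto 30,000 cells corresponds to an MOI of about 2.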

In vitro transduction experiments

Cryopreserved granulocyte colony-stimulating factor (G-CSF) and Plerixafor-mobilized PB-derived CD34+ cells (Allcells) were thawed in StemSpan SFEM II medium (StemCell Technologies, 09065) supplemented with Glutamax (Thermo Fisher, 35050061) and penicillin–streptomycin (PS; Thermo Fisher, 15140122). Then, 100 μl of cells were plated in a round-bottom 96-well plate at 2 × 10^5 viable cells per ml. After overnight rest in a 37 °C incubator, 100 μl of diluted vector + vectofusin (Miltenyi Biotec, 130-111-163) was added at a final concentration of 12 μg ml−1. After 24 h of transduction, StemSpan SFEM II medium containing 2× CC110 cytokine cocktail (StemCell Technologies, 02697) was added. After 6–7 days in culture with cytokines, we assessed the degree of transduction by GFP expression on a BD Fortessa instrument. The CC110 cytokine cocktail was excluded from media in the experimental groups described in Fig. 1c,d. To assess the off-target risk, cells were seeded at 100 µl according to the number from Supplementary Table 1 in flat-bottom 96-well plates before transduction. The vectors were normalized to receive 1 TU per cell on the basis of their titers on CD34+ primary cells. Vectors were prediluted in culture medium and vectofusin (Miltenyi Biotec, 130-111-163) was added to them to a final concentration of 12 µg ml−1. Then, 100 µl of the vector–vectofusin mix was added to the cells. Human PB pan-T (StemCell Technologies, 70024) and CD34+ cells were tested for transduction efficiency in their activated versus unactivated (resting) states. The activated pan-T cells received interleukin 2 (IL-2) cytokine (StemCell Technologies, 78036.2) and CD3/CD28 activator beads (StemCell Technologies, 10991), whereas the activated CD34+ cells received a CC110 cytokine cocktail (StemCell Technologies, 02697) before transduction. The resting arms were activated 24 h after transduction with the same activation cocktail. A medium change was performed on day 4.
The analysis of the transduction efficiency was performed by FC and VCN assay on day 7 after transduction. SupT1 (ATCC, CRL-1942), Raji (ATCC, CCL-86) and KG1A (ATCC, CCL-246.1) cell lines were cultured in RPMI medium + 10% FBS. HEK-293 Lenti-X cell lines were cultured in DMEM + 10% FBS. Caco-2 cell lines (ATCC, HTB-37) were cultured in EMEM + 10% FBS. Hulec-5a cell lines (ATCC, CRL-3244) were cultured in MCDB131 medium, without L-glutamine, supplemented with 10 ng ml−1 epidermal growth factor, 1 μg ml−1 hydrocortisone, 10 mM glutamine and 10% FBS. Human umbilical vein endothelial cells (HUVECs; ATCC, CRL-1730) were cultured in Vasculife medium + vasculife supplement. For primary cell culture, activated and resting pan-T cells were cultured in Immunocult supplemented with 10 ng ml−1 IL-2 and PS. Human pulmonary alveolar epithelial cells (ScienCell, 3200) were cultured on lysine-coated plates in alveolar (AEpiCM) medium supplemented with FBS, endothelial cell growth supplement (ECGS) and PS. Human renal epithelial cells (ScienCell, 4120) were cultured on lysine-coated plates in EpiCM medium supplemented with FBS, ECGS and PS. Human lymphatic endothelial cells were cultured on fibronectin-coated plates in endothelial cell medium supplemented with FBS, ECGS and PS. Human aortic endothelial cells (ATCC, PCS-100-011), human pulmonary artery endothelial cells (ATCC, PCS-100-022) and human coronary artery endothelial cells (ATCC, PCS-100-020) were cultured in Vasculife medium + supplement. PHHs (Lonza, HUCPI) were cultured in hepatocyte basal medium + supplement.

FC and FACS sorting

For in vivo studies, we used two FC panels. For the lineage-positive panel, cells were washed in PBS, incubated with Human TruStain FcX receptor block (Biolegend, 422302; 1:20 in BSA stain buffer (BD Biosciences, 554657)) and stained with an antibody cocktail consisting of CD34 PE (BD Biosciences, 348057), CD15 AF700 (Biolegend, 323026), CD14 APC (Biolegend, 367118), CD3 BV421 (Biolegend, 344834), CD19 PE/Cy7 (Biolegend, 302216), CD33 BV650 (Biolegend, 303430), CD13 BV650 (BD Biosciences, 740567), huCD45 BV711 (Biolegend, 304050), msCD45 BV605 (Biolegend, 103140) and CD41 PE/Dazzle 594 (Biolegend, 303732). Immediately before running the samples on FC, cells were stained with 7-AAD (Biolegend 420404). For the lineage-negative panel, cells were washed in PBS and stained with live–dead fixable near-infrared dead cell stain kit for 633-nm or 635-nm excitation (Invitrogen, L10119). Subsequently, cells were incubated with Human TruStain FcX receptor block (Biolegend, 422302; 1:20 in BSA stain buffer (BD Biosciences, 554657)). The lineage-negative antibody cocktail consisted of the following fluorescent antibodies: huCD45 PE/Cy7 (Biolegend, 304016), msCD45 BV605 (Biolegend, 103140), CD34 APC (BD Biosciences, 345804), CD38 PE/Cy5 (Biolegend, 103149), Pacific Blue anti-human lineage cocktail (CD3, CD14, CD16, CD19, CD20 and CD56; Biolegend, 348805), CD15 Pacific Blue (Biolegend, 980504), CD90 PE/CF594 (BD Biosciences, 562385), CD45RA BV711 (BD Biosciences, 740806), CD7 AF700 (BD Biosciences, 561603), CD10 BV785 (Biolegend, 312238), CD133 PE (Biolegend, 372804) and CD117 BV650 (Biolegend, 313222). For the detection of B2M cells upon in vitro titration experiments, we used a B2M PE antibody (Biolegend, 316306). For in vivo readouts, we added to both panels a B2M FITC antibody (Biolegend, 316304). 
For FACS sorting of CD34+ cell subtypes, approximately 10^6 total CD34+ cells (StemCell Technologies, 70008) were stained in an antibody cocktail containing CD34 APC (BD Biosciences, 340441), lineage cocktail Pacific Blue (Biolegend, 348805), CD15 Pacific Blue (Biolegend, 980504), CD38 PE/Cy5 (Biolegend, 303508), CD90 PE/CF594 (BD Biosciences, 562385), CD45RA BV711 (BD Biosciences, 740806), CD7 AF700 (BD Biosciences, 561603), CD10 BV605 (Biolegend, 312222), CD135 PE (Biolegend, 313306) and DAPI (Thermo Fisher, D1306). Cells were sorted into various HSPC subpopulations using a BD Symphony S6 cell sorter.

In vivo experiments in NBSGW mice

All husbandry and experimental procedures were approved by the Institutional Animal Care and Use Committee of the Charles River Accelerator and Development Lab. NBSGW mice (Jackson Labs, strain 026622) were used. Mouse age at the time of study start was 6–9 weeks. On the day of cell dosing, huCD34+ cells (double-mobilized by 100 μg ml−1 G-CSF and 5 mg kg−1 AMD3100) were thawed from cryovials stored in liquid nitrogen, washed with sterile PBS and resuspended at 300,000–500,000 cells in 200 μl of PBS. For the PBMC spike-in study, 500,000 CD34+ cells and 10^7 PBMCs from the same donor were mixed in 200 μl of PBS and dosed IV. On the day of LV or VLP dosing, LV or VLP was thawed and kept on ice. LV or VLP was premixed with vectofusin (Miltenyi Biotec, 130-111-163) at a final concentration of 12 µg ml−1 exactly 10 min before dosing in vivo. At study completion, mice were anesthetized using CO2 and organs were collected for processing and downstream FC and histopathology analysis. Tissues were processed as follows. For PB, 75–100 µl of whole mouse blood was added to a V-bottom 96-well plate. Then, 75–100 µl of PBS (Gibco, 10010-023; 1:1 ratio with the blood) was added to the wells containing blood samples. Samples were pipetted up and down to mix the blood and PBS. The 96-well plate containing the blood–PBS mixture was spun at 1,000g for 2 min and the supernatant was discarded. Next, 200 µl of ACK lysis buffer (Gibco, A10492-01) was added to each well that contained a blood pellet. The pellets were mixed well in the ACK buffer by pipetting and incubated at room temperature for 3 min. Cells were spun again at 1,000g for 2 min and the supernatant was discarded. Next, 200 µl of 1× PBS (Gibco, 10010-023) was added to each well. Cells were spun again at 1,000g for 2 min and the supernatant was discarded. Then, 200 µl of ACK lysis buffer (Gibco, A10492-01) was added to each well a second time.
The pellets were mixed well in the ACK buffer by pipetting and incubated at room temperature for 3 min. Cells were spun again at 1,000g for 2 min and the supernatant was discarded. Next, 200 µl of 1× PBS (Gibco, 10010-023) was added to each well. Cells were spun again at 1,000g for 2 min, the supernatant was discarded and cells were resuspended in 1× PBS (Gibco, 10010-023) and transferred over to a 96-well U-bottom plate for staining for FC. For BM, muscle and residue tissue was removed from the femur and tibia with sterile forceps and scissors. Bones were transferred into a sterile mortar containing 5 ml of ice-cold 1× PBS (Gibco, 10010-023) and smashed with a pestle. The crushed bones were filtered through a 40-µm nylon cell strainer to remove solid fragments, the volume of the filtrate was brought up to 10 ml of PBS (Gibco, 10010-023) and the filtrate was centrifuged at 300g for 10 min. Following centrifugation, the supernatant was removed, the cell pellet was resuspended in 10 ml of 1× PBS (Gibco, 10010-023) and filtered through a 40-µm nylon cell strainer. The filtrate was centrifuged for 5 min at 350g (4 °C). Following centrifugation, the supernatant was removed and cells were resuspended in 1× PBS for staining for FC. Spleens were isolated from mice and stored on ice in 1× PBS (Gibco, 10010-023). To start tissue processing, spleens were placed into 2 ml of ACK lysis buffer (Gibco, A10492-01). Spleens were dissociated by repeatedly crushing between two frosted sides of a glass microscope slide. The dissociated splenic tissue was incubated at room temperature in 2 ml of ACK lysis buffer (Gibco, A10492-01) for 10 min. The dissociated splenic tissues were then filtered through a 40-µm nylon cell strainer to remove solid fragments, the volume of the filtrate was brought up to 10 ml in PBS (Gibco, 10010-023) and the filtrate was centrifuged at 350g for 5 min. 
Following centrifugation, the supernatant was removed and the cell pellet was resuspended in 10 ml of 1× PBS (Gibco, 10010-023) and filtered through a 40-µm nylon cell strainer. The filtrate was centrifuged for 5 min at 350g (at 4 °C). Following centrifugation, the supernatant was removed and cells were resuspended in 1× PBS for staining for FC.

In vivo experiment in FRG mice

The FRG humanized liver mouse model was used to assess vector potential to transduce human hepatocytes in vivo. Mice were maintained and the study was conducted by Yecuris. FRG mice have an FAH−/− mutation, which enables ablation of the murine hepatocytes, creating a niche for engraftment of donor hepatocytes. They also have Rag2−/−Il2rg−/− immunodeficiency mutations, which allow persistence and expansion of engrafted human hepatocytes. The NPCs of the liver (Kupffer cells and endothelial cells) remained of murine origin. After confirmation of >80% humanization by human albumin levels (>3.5 mg ml−1), mice were dosed with ~300× concentrated vector at a 5 ml kg−1 dose. Unless otherwise noted, mice were killed 14 days after injection and livers were isolated for tissue dissociation and FC analysis of the human and murine cell populations. For FC analysis, hepatocytes were stained using hASGR1 AF647 (R&D Systems, FAB43941R, clone 950203), mCD81 PE (Biolegend, 104906, clone Eat-2) and eBioscience eFluor780 viability dye (Thermo Fisher). NPCs were stained with F4/80 PE (Biolegend, 123110, clone BM8), CD31 APC (Biolegend, 102410, clone 390) and eBioscience eFluor780 viability dye (Thermo Fisher).

VCN analysis

Cells were washed once with PBS to remove the medium. Suspension cells were centrifuged at 500g for 5 min and the supernatant was discarded before washing. DNA was extracted using QuickExtract DNA extraction solution (BioSearch Technologies, QE0905T) by incubating the cells with the solution at 65 °C for 6 min, followed by 98 °C for 2 min. The extracts were then diluted at least threefold with ultrapure water to avoid PCR inhibition by the extraction solution. A PCR master mix was prepared at a final volume of 20 µl containing ddPCR Supermix for Probes, no dUTP (Bio-Rad, 186-3024), delU3 primer/probe mix (DelU3 forward primer, 5′-GGAAGGGCTAATTCACTCCC-3′; DelU3 reverse primer, 5′-GGTTTCCCTTTCGCTTTCAGG-3′; DelU3 probe, 5′-/56-FAM/TGCCCGTCTGTTGTGTGACTCTG/3IABxFQ/-3′) and ArX primer/probe mix (ArX forward primer, 5′-TATGTTCAGATGCCCATTAGGG-3′; ArX reverse primer, 5′-CTTGCTCAAAGGACTGTGATTTC-3′; ArX probe, 5′-/5HEX/AGTGCCTTT/ZEN/CAGATGGAAACGGGT/3IABkFQ/-3′). Then, 5 µl of the diluted DNA was added to the PCR master mix. A Bio-Rad QX200 AutoDG droplet generator was used to generate droplets. The droplets were then subjected to PCR at 95 °C for 10 min, with 40 cycles at 95 °C for 10 s, followed by 60 °C for 60 s and 98 °C for 10 min. The PCR-amplified material was then read on the droplet reader.
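The text does not give the formula used to convert droplet reader output into a VCN. A common convention, sketched here as an assumption rather than the authors' stated method, divides the vector (delU3, FAM) concentration by the reference (ArX, HEX) concentration and multiplies by the number of reference copies expected per cell. Treating ArX as the genomic reference assay, the function name and the parameter names are all illustrative. The reference copy number per cell has no default because it depends on the locus and the donor.

```python
def vector_copy_number(vector_copies_per_ul, reference_copies_per_ul,
                       reference_copies_per_cell):
    """Estimate vector copies per cell from duplexed ddPCR concentrations.

    Assumed convention (not stated in the text):
        VCN = (vector / reference) * reference copies per cell
    reference_copies_per_cell must be supplied by the caller, because it
    depends on the reference locus and on the cells being analyzed.
    """
    if reference_copies_per_ul <= 0:
        raise ValueError("reference concentration must be positive")
    ratio = vector_copies_per_ul / reference_copies_per_ul
    return ratio * reference_copies_per_cell
```

With hypothetical readings of 150 vector copies per µl and 300 reference copies per µl, and two reference copies per cell, the estimate is one vector copy per cell.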

HTS analysis of genome editing

Genomic DNA was purified using QIAamp DNA micro kit (Qiagen, 56304) according to the manufacturer’s instructions. Purified genomic DNA was first amplified with primer pairs spanning around 200–300 bp at the target edited regions using Q5 hot start high-fidelity 2× master mix (Qiagen, M0494L) for 35 cycles (98 °C for 60 s, followed by 35 cycles of 98 °C for 10 s, 66 °C for 20 s and 72 °C for 30 s, with a final step of 72 °C for 2 min, before holding at 4 °C). Second amplifications were performed with indexed primer pairs using Q5 hot start high-fidelity 2× master mix (Qiagen, M0494L) for 12 cycles (98 °C for 60 s, followed by 12 cycles of 98 °C for 10 s, 65 °C for 20 s and 72 °C for 30 s, with a final step of 72 °C for 2 min, before holding at 4 °C). Pooled indexed libraries were cleaned up by AMPure XP solid-phase reversible immobilization reagent (Beckman Coulter, A63881) and measured with a 4150 TapeStation system (Agilent) using D5000 DNA ScreenTape assays (5067-5588/89, Agilent) for size range and a Qubit 4 fluorometer (Thermo Fisher Scientific) with high-sensitivity dsDNA assays (Thermo Fisher Scientific, Q33231) for quantification before clustering on Illumina sequencers. Forward and reverse primers complementary to sequences upstream and downstream of the region of interest were designed with 5′–3′ overhang adaptors. A subsequent limited-cycle amplification step was performed to add multiplexing indices and Illumina sequencing adaptors. An amplicon of ~200 bp was amplified by the primer pair. Libraries were normalized, pooled and sequenced on an Illumina MiSeq or NextSeq depending on throughput required and availability: forward overhang, 5′ CTCTTTCCCTACACGACGCTCTTCCGATCT-[locus-specific seq]; reverse overhang, 5′ CTGGAGTTCAGACGTGTGCTCTTCCGATCT-[locus-specific seq] (additional details in Supplementary Information Table 2).
The amplicon sequencing data were used to characterize and quantify the nuclease editing activity at the intended, on-target genomic target loci. The paired-end FASTQ files were trimmed for readthrough of the Illumina TruSeq adaptor and amplicon primers using Cutadapt version 2.10. Quality filtering (≥Q30) and quality control were performed with fastp version 0.20.0. FASTQ files were aligned to the human genome reference (GRCh38) using Minimap2 version 2.24. Reads that had a mate overlap a 15-bp window around the target site were included. The R version 4.0.3 package CrispRVariants version 1.18.0 was used to characterize and quantify editing. Read pairs were classified as follows: wild-type reads, reads identical to the reference genome; single-nucleotide variants (SNVs), reads that only included mismatches to the reference; indel-containing reads, reads that included single contiguous insertions or deletions; complex variant, reads containing complex variants, multiple indels and/or SNVs; other, reads containing more complex structural rare variants. Percentage editing served as an estimate of overall measure of nuclease activity at a given target and was calculated as ((total reads – SNV-only reads – wild-type reads)/total reads) × 100. Similarly, the percentage of editing variants was calculated as (total reads carrying a given editing variant/total reads) × 100. For the samples dosed with the BCL11A ABE cargo, reads that contained an exact match to the 7 bp upstream and downstream of the guide were quantified for the presence of the two A>G mutations in and directly after the GATA1-binding site. Read counts for each of the motif variants (TTTATCG, TTTGTCA and TTTGTCG) were obtained for the four BCL11A ABE-edited samples.
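The two percentage formulas above can be written out directly. This is a minimal sketch that operates on per-class read counts such as those produced after read classification. The dictionary keys and function names are hypothetical and do not correspond to CrispRVariants output fields.

```python
def percent_editing(counts):
    """Percentage editing as defined in the text:
    ((total reads - SNV-only reads - wild-type reads) / total reads) x 100.

    `counts` maps a read class to its read count. Illustrative keys:
    "wild_type", "snv_only", "indel", "complex", "other".
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to quantify")
    edited = total - counts.get("snv_only", 0) - counts.get("wild_type", 0)
    return 100.0 * edited / total


def percent_variant(variant_reads, total_reads):
    """Percentage of reads carrying one given editing variant."""
    if total_reads == 0:
        raise ValueError("no reads to quantify")
    return 100.0 * variant_reads / total_reads
```

For a hypothetical sample with 600 wild-type, 50 SNV-only, 300 indel, 40 complex and 10 other reads, 350 of 1,000 reads count as edited, giving 35% editing. SNV-only reads are excluded from the numerator because they cannot be distinguished from sequencing error or natural variation.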

RNA-seq analyses

For RNA-seq analyses of PHH and FRG-derived PHH cells (Extended Data Fig. 8), samples were sent to GENEWIZ (Azenta Life Sciences) where total RNA was extracted from each sample, followed by library prep according to Illumina specifications and sequencing with 150-bp paired-end reads on an Illumina HiSeq platform. The resulting raw sequencing reads were checked with MultiQC version 1.14 and trimmed with Cutadapt version 4.4 and fastp version 0.23.2. The trimmed reads were mapped to the human genome reference (GENCODE GRCh38 v43) using the STAR version 2.7.10b aligner and quantified with salmon version 1.9.0. The PHH FRG samples were mapped to the human genome reference and the mouse genome reference (GENCODE GRCm39 v32) and only the human genes were used in subsequent analyses for these samples. Genes with ≤2 counts in <3 samples were filtered from subsequent analyses. Differential expression analyses for pairwise comparisons were processed using DESeq2 version 1.40.2. Differently expressed genes were defined with |log2FC| > 2 and adjusted P value < 0.01. Multiple groups were compared using a one-way analysis of variance (ANOVA) and Bonferroni correction with α = 0.05. Genes that passed the correction threshold were visualized in R version 4.3.0 with pheatmap version 1.0.12.
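The thresholds above can be expressed as simple predicates. The sketch below is written in Python for illustration only (the analysis itself used DESeq2 in R). It assumes the per-gene log2 fold change and adjusted P value have already been computed, and that the Bonferroni correction is applied by dividing α by the number of tests. Function names are hypothetical.

```python
import math


def is_differentially_expressed(log2_fold_change, adjusted_p,
                                lfc_cutoff=2.0, padj_cutoff=0.01):
    """Apply the stated cutoffs: |log2FC| > 2 and adjusted P < 0.01.

    A missing adjusted P value (None or NaN) is treated as not significant.
    """
    if adjusted_p is None or math.isnan(adjusted_p):
        return False
    return abs(log2_fold_change) > lfc_cutoff and adjusted_p < padj_cutoff


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests
```

Both inequalities are strict, so a gene with a log2 fold change of exactly 2 is not called. With α = 0.05 and a hypothetical 1,000 genes tested by ANOVA, the per-gene threshold would be 5 × 10^-5.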

For the HEK-293 Lenti-X, Hulec-5a, HUVEC, Raji, SupT1 and human primary cells (Supplementary Fig. 9b), sample processing and sequencing for RNA-seq were completed by GENEWIZ (Azenta Life Sciences) with 150-bp paired-end reads on an Illumina HiSeq platform. The RNA-seq reads were processed by GENEWIZ, where the reads were mapped to the human genome reference (GRCh38) using STAR aligner. A single representative sample was chosen from publicly available data for each of the Caco-2 and KG1A cell lines and is indicated in the legend information in Supplementary Table 3 (refs. 53,54). The log2-transformed CPM normalized counts (edgeR version 3.42.4) were visualized in R version 4.3.0 with pheatmap version 1.0.12 for the immortalized and primary cell lines with the six genes of interest (LDLR, ASCT1, ASCT2, CD133, VLA-4 and VLA-5). No statistical analyses were conducted in this context.

Insertion site analysis

The insertion site analysis shown in Extended Data Fig. 1e,f was performed as previously described (ref. 55). Briefly, the protocol was based on the fragmentation by sonication of genomic DNA isolated from CD34+ cells, followed by ligation of a linker cassette (LC) containing a known sequence and a unique molecular identifier (UMI) for abundance estimation. Each ligation product underwent two rounds of exponential PCR amplification using long terminal repeat-specific and LC-specific primers. An additional PCR step was carried out to introduce into each sample Illumina adaptor sequences and a known sample index sequence for multiplexing. A pooled sample library was then loaded on a MiSeq System (Illumina) and the sequence analysis was performed using the IS-Seq computational pipeline (ref. 55). For the bubble plot generation and the Shannon diversity index calculation reported in Extended Data Fig. 1e,f, we used UMI counts as the insertion site abundance quantification method.
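The Shannon diversity index computed from UMI counts can be sketched as follows. The formula H = −Σ p_i ln p_i is the standard definition. The use of the natural logarithm is an assumption, because the text does not state the log base, and the function name is hypothetical. This is not the IS-Seq implementation.

```python
import math


def shannon_index(umi_counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over insertion sites.

    `umi_counts` is an iterable with one UMI count per insertion site.
    Sites with zero counts contribute nothing and are skipped.
    Assumption: natural logarithm.
    """
    counts = [c for c in umi_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one insertion site needs a positive count")
    return -sum((c / total) * math.log(c / total) for c in counts)
```

Four equally abundant insertion sites give H = ln 4, the maximum for four sites, whereas a sample dominated by a single site gives a value close to 0. A lower index therefore indicates a more clonally skewed population.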

Histology

All immunohistochemistry was performed on the Leica Bond RX fully automated staining instrument running Bond 6.0 software. Antibodies were all diluted in SignalStain antibody diluent (Cell Signaling Technology, 8112). Dual chromogenic staining of human nucleoli (Abcam, 190710, clone NM95) at 1:4,000 dilution was performed using the bond polymer refine detection system (Leica Biosystems, DS9800) with EDTA antigen retrieval using bond epitope retrieval 2 (Leica Biosystems, AR9640) for 20 min at 90 °C. Staining of GFP (Cell Signaling Technology, 2956, clone D5.1) at 1:50 dilution was performed using the bond polymer refine red detection system (Leica Biosystems, DS9390) with EDTA antigen retrieval using bond epitope retrieval 2 (Leica Biosystems, AR9640) for 20 min at 90 °C. Counterstaining was performed using hematoxylin from the bond polymer refine red detection system (Leica Biosystems, DS9390). Slides were air-dried, submerged in xylene and cover-slipped using CV mountant (Leica Biosystems, 14070936261). Whole-slide imaging was performed on the Leica Versa 200 microscope at ×40 magnification. TIF files were extracted from the Leica SVS scan files using Leica Aperio Imagescope version 12.4.6.5003. Sternum and liver samples were fixed in 10% neutral buffered formalin (Epredia, 534801) for 48 h at room temperature. Following fixation, sternum samples were decalcified overnight using Formical-2000 decalcifier (StatLab, 1314). Both sternum and liver samples were processed on the Leica Pearl tissue processor through a graded ethanol series (Fisher, A962P4) as follows: 70%, 80%, 90%, 100%, 100% and 100%. This was followed by three xylene solvent steps (Fisher, X3P) and three paraffin steps (Paraplast, Leica, 39601006) using the default 12-h protocol. Samples were embedded using the Leica HistoCore Arcadia embedding center and sectioned at 5 µm on the Leica HistoCore Multicut microtome.

Assessment of HbF induction in CFU

CD34+ sorted human HSPCs harvested from mouse BM were seeded at 5 × 10^2 cells per tube in Methocult classic (StemCell Technologies, 04444, lot 1000138489). Two plates were seeded per animal from the same Methocult tube. Cells were incubated in StemSpan SFEM II medium with PS (Gibco). On day 14, cells were removed from the incubator and counted for BFU/CFU and GMs. Flow analysis was performed using antibodies CD71 BB700 (BD Biosciences, 746082, clone M-A712), HbF PE (BD Biosciences, 560041, clone 2D12) and CD235a PE/Cy7 (BD Biosciences, 563666, clone HIR2).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
