Hilary Parker
Hilary S. Parker | |
---|---|
Alma mater | Pomona College (BA); Johns Hopkins Bloomberg School of Public Health (MHS, PhD) |
Scientific career | |
Fields | Biostatistics, data science |
Institutions | Etsy; Stitch Fix |
Thesis | Practical statistical issues in translational genomics (2013) |
Doctoral advisor | Jeffrey T. Leek |
Hilary S. Parker is an American biostatistician and data scientist. She was formerly a data scientist at the personal styling company Stitch Fix and, before that, a senior data analyst at Etsy. Parker co-hosts the data analytics podcast Not So Standard Deviations with Roger Peng. She received her PhD in biostatistics from the Johns Hopkins Bloomberg School of Public Health.
Life and education
Parker graduated from Pomona College in 2008 with a bachelor's degree in molecular biology and mathematics. After earning her MHS, she obtained her PhD in biostatistics from the Johns Hopkins Bloomberg School of Public Health in 2013.[1] Parker resides in San Francisco.[2]
Parker's scientific research began during her PhD in the areas of genomics and personalized medicine. Her research examined factors such as batch effects and their impact on prediction.[3][4] Working alongside Jeffrey T. Leek, Parker developed methods for the application of genomic technologies in personalized medicine.[5] Batch effects confound data produced by genomic technologies such as microarrays. Parker's work aims to correct predictions that are influenced by batch effects, which matters because the data is used for diagnosis.[6] In her dissertation, "Practical statistical issues in translational genomics," Parker proposed frozen surrogate variable analysis (fSVA) to improve prediction accuracy in public genomic studies and simulations.[7]
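As a rough illustration of what a batch effect looks like in data, the following minimal sketch simulates one feature measured in two batches, where the second batch carries an additive offset. It then removes the offset by centering each batch on its own mean. This is a toy example and not fSVA or the sva package: the function names, the offset of 3.0, and the use of simple mean-centering are all assumptions made for illustration. Centering of this kind is only reasonable when the batches have comparable biological composition, since otherwise it can remove real signal along with the batch offset.

```python
import random
import statistics


def simulate_batch(n, offset, rng):
    """Draw n noisy measurements of one feature with an additive batch offset.

    Hypothetical helper: the Gaussian noise model and the offset are
    illustrative assumptions, not taken from Parker's work.
    """
    return [rng.gauss(0.0, 1.0) + offset for _ in range(n)]


def center(values):
    """Subtract the batch mean from every value in the batch."""
    batch_mean = statistics.fmean(values)
    return [v - batch_mean for v in values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Same underlying signal, but batch B is shifted by a technical offset.
batch_a = simulate_batch(200, 0.0, rng)
batch_b = simulate_batch(200, 3.0, rng)

# Difference in batch means before and after per-batch centering.
gap_before = statistics.fmean(batch_b) - statistics.fmean(batch_a)
gap_after = statistics.fmean(center(batch_b)) - statistics.fmean(center(batch_a))

print(f"gap between batch means before centering: {gap_before:.2f}")
print(f"gap between batch means after centering:  {gap_after:.2f}")
```

Before centering, a predictor trained on batch A would see batch B's measurements as systematically shifted. After centering, the two batches share a common mean.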
Career and research
After her PhD, Parker went on to work as a data scientist in industry. Her first job was as a data analyst (later, senior data analyst) at Etsy, where she worked for approximately three years.[8] Parker described her position as that of an internal statistical consultant, eventually focused on developing A/B tests and other experiments run by the company and on analyzing the resulting data.[8][9] Her work for the company involved opportunity sizing, experimentation, and impact analysis.[1]
In 2015, Parker began work on the podcast Not So Standard Deviations with co-host Roger Peng.[10][11] The pair discuss data analytics, covering statistical computation, data cleaning, and R packages.[10][12] The show is among the more popular data science and statistics podcasts, with over half a million downloads.[13][14] The two also co-authored the book Conversations on Data Science, based on their conversations on the podcast. They recorded their 100th podcast episode live on stage as a keynote presentation at the RStudio-sponsored rstudio::conf 2020.[15]
After leaving Etsy, Parker became a data scientist at the personal styling site Stitch Fix. The company employs a human-in-the-loop algorithmic process to generate a recommended box of clothing that is shipped to subscribers.[16][17] At Stitch Fix, Parker optimized the algorithms the site uses to recommend clothes and helped determine what data is needed from clients to match them with clothing. She worked on new forms of data generation and helped build datasets powering outfits.[2][16] Parker left Stitch Fix in August 2020 to join the Joe Biden 2020 presidential campaign.[18]
Parker speaks at conferences,[19] often as a keynote speaker.[20] She coined the term "opinionated analysis development" to describe a framework for producing robust data analysis that resembles some aspects of software design.[21][12]
Awards
In 2012, Parker received the Helen Abbey Award from Johns Hopkins. This award is given to a student who intends to teach biostatistics.[22]
Selected works
Parker has contributed to several publications and projects, including the following:
- Leek, Jeffrey T.; Johnson, W. Evan; Parker, Hilary S.; Jaffe, Andrew E.; Storey, John D. (March 15, 2012). "The sva Package for Removing Batch Effects and Other Unwanted Variation in High-throughput Experiments". Bioinformatics. 28 (6): 882–883. doi:10.1093/bioinformatics/bts034. ISSN 1367-4803. PMC 3307112. PMID 22257669.
- Parker, Hilary S.; Leek, Jeffrey T. (January 16, 2012). "The practical effect of batch on genomic prediction". Statistical Applications in Genetics and Molecular Biology. 11 (3): Article-10. doi:10.1515/1544-6115.1766. ISSN 1544-6115. PMC 3760371. PMID 22611599.
- Parker, Hilary (January 30, 2013). "Hillary: The Most Poisoned Baby Name in U.S. History". The Cut.
- Parker, Hilary S.; Leek, Jeffrey T.; Favorov, Alexander V.; Considine, Michael; Xia, Xiaoxin; Chavan, Sameer; Chung, Christine H.; Fertig, Elana J. (October 2014). "Preserving Biological Heterogeneity with a Permuted Surrogate Variable Analysis for Genomics Batch Correction". Bioinformatics. 30 (19): 2757–2763. doi:10.1093/bioinformatics/btu375. ISSN 1460-2059. PMC 4173013. PMID 24907368.
- Peng, Roger D.; Parker, Hilary (2016). Conversations On Data Science. Leanpub. Retrieved August 10, 2020.
- Parker, Hilary (August 2017). "Opinionated Analysis Development" (PDF). PeerJ. doi:10.7287/peerj.preprints.3210v1. Archived from the original (PDF) on November 12, 2020. Retrieved August 10, 2020.
References
- ^ a b "About Hilary". Not So Standard Deviations. July 28, 2012. Archived from the original on June 23, 2021. Retrieved August 10, 2020.
- ^ a b "QCon San Francisco | Hilary Parker | R Enthusiast, Co-Host of the Not So Standard Deviations Podcast, & Data Scientist @stitchfix". QCon San Francisco 2020. Retrieved August 10, 2020.
- ^ Leek, Jeffrey T.; Johnson, W. Evan; Parker, Hilary S.; Jaffe, Andrew E.; Storey, John D. (March 15, 2012). "The sva Package for Removing Batch Effects and Other Unwanted Variation in High-throughput Experiments". Bioinformatics. 28 (6): 882–883. doi:10.1093/bioinformatics/bts034. ISSN 1367-4803. PMC 3307112. PMID 22257669.
- ^ Parker, Hilary S.; Leek, Jeffrey T. (January 16, 2012). "The practical effect of batch on genomic prediction". Statistical Applications in Genetics and Molecular Biology. 11 (3): Article-10. doi:10.1515/1544-6115.1766. ISSN 1544-6115. PMC 3760371. PMID 22611599.
- ^ Parker, HS; Leek, JT; Favorov, AV; Considine, M; Xia, X; Chavan, S; Chung, CH; Fertig, EJ (October 2014). "Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction". Bioinformatics. 30 (19): 2757–63. doi:10.1093/bioinformatics/btu375. PMC 4173013. PMID 24907368.
- ^ "Hilary Parker: Research". Johns Hopkins Bloomberg School of Public Health. Archived from the original on December 27, 2017. Retrieved July 30, 2020.
- ^ Parker, HS; Corrada Bravo, H; Leek, JT (2014). "Removing batch effects for prediction problems with frozen surrogate variable analysis". PeerJ. 2: e561. doi:10.7717/peerj.561. PMC 4179553. PMID 25332844.
- ^ a b "Hilary Parker Gets Crafty with Statistics in Her Not-So-Standard Job". This is Statistics. January 11, 2016. Archived from the original on November 7, 2021. Retrieved July 30, 2020.
- ^ Machlis, Sharon (January 11, 2017). "RStudio's new enterprise platform moves out of beta". Computerworld. Archived from the original on November 11, 2020. Retrieved July 31, 2020.
- ^ a b "Not So Standard Deviations". nssdeviations.com. Archived from the original on February 28, 2021. Retrieved August 10, 2020.
- ^ Lowndes, Julia Stewart (February 20, 2020). "rOpenSci's Leadership in #rstats Culture". R-bloggers. Archived from the original on November 9, 2021. Retrieved July 30, 2020.
- ^ a b "Not So Standard Deviations: Not Your Average Data Science Podcast". Teach Data Science. June 4, 2019. Archived from the original on November 9, 2021. Retrieved August 10, 2020.
- ^ "25 Super Data Science Podcasts You Must Follow in 2020". Techfunnel. March 2, 2020. Archived from the original on June 17, 2020. Retrieved July 30, 2020.
- ^ Choudhury, Ambika (May 9, 2019). "Top 15 Data Science Podcasts To Subscribe To In 2019". Analytics India Magazine. Archived from the original on January 19, 2021. Retrieved July 30, 2020.
- ^ "Not So Standard Deviations Episode 100". rstudio.com. February 6, 2020. Archived from the original on October 8, 2020. Retrieved July 30, 2020.
- ^ a b Pardes, Arielle (September 12, 2019). "Need Some Fashion Advice? Just Ask the Algorithm". Wired. ISSN 1059-1028. Archived from the original on April 7, 2020.
- ^ Colaner, Seth (July 5, 2020). "How Stitch Fix used AI to personalize its online shopping experience". VentureBeat. Archived from the original on February 22, 2021. Retrieved July 30, 2020.
- ^ Parker, Hilary [@hspter] (August 14, 2020). "Today is my last day at Stitch Fix. Next week I am joining the Biden campaign full-time. 81 days. Let's do this" (Tweet) – via Twitter.
- ^ "Opinionated Analysis Development". rstudio.com. February 12, 2017. Archived from the original on December 2, 2020. Retrieved July 30, 2020.
- ^ "ICOTS10: Scientific Programme: Keynote Speakers". International Conference on Teaching Statistics. Archived from the original on November 26, 2020. Retrieved July 30, 2020.
- ^ Parker, Hilary (August 2017). "Opinionated Analysis Development" (PDF). PeerJ. doi:10.7287/peerj.preprints.3210v1. Archived (PDF) from the original on November 12, 2020. Retrieved August 10, 2020.
- ^ "Honors and Awards: The Helen Abbey Award". Johns Hopkins Bloomberg School of Public Health. Archived from the original on May 2, 2021. Retrieved August 10, 2020.
External links
- Hilary Parker publications indexed by Google Scholar
- Using Data Effectively: beyond Art and Science. Presentation at QCon. November 28, 2018.