17th May, 2023
Dr. Bushra Anjum: Breaking Barriers & Shaping Futures in Big Data
Bushra Anjum, Ph.D., is a health IT data specialist currently serving as the Director of Data Science & Analytics at Doximity, a San Francisco-based health tech firm. She leads a team of analysts, scientists, and engineers to create HIPAA-secure, data-driven tools for over 1.8 million U.S. clinicians.
Formerly a Fulbright scholar from Pakistan, Dr. Anjum served in academia (both in Pakistan and the USA) for many years before joining the tech industry. She continues to volunteer at Cal State University campuses, and public schools, teaching CS-related content and serving as an industry mentor.
Dr. Anjum also serves as a community leader and mentor at GlobalTechWomen, Pakistani Women in Computing (PWiC), CRA Widening Participation (CRA-WP), Rewriting the Code (RTC), Empowering Leadership Alliance (ELA), LeanIn.org, TechGirlz, and CodeGirls, among others.
Dr. Anjum is a celebrated speaker at the Grace Hopper Celebration of Women in Computing Conference, the Richard Tapia Celebration of Diversity in Computing Conference, STARS Celebrations, and CRA-WP Grad Cohorts, to name a few. She has been recognized by Tribune as a Top 20 under 40 Professional for career excellence with a deep commitment to community service. She was nominated as the Volunteer Woman of the Year 2021 by the SLO County Commission on the Status of Women and Girls. She is also a recipient of the LUMS VC Alumni Achievement Award for sustained excellence in engineering, and leadership in the profession and public affairs.
Dr. Anjum received her Ph.D. in Computer Science at North Carolina State University in 2012 for her doctoral thesis on Bandwidth Allocation under End-to-End Percentile Delay Bounds.
What got you interested in Data Science?
Most of my educational and career choices have been guided by the love of exploring the unknown. After completing my MSc at LUMS in 2007, I opted for a Ph.D. degree as it was an ambitious adventure. No one in my family has a Ph.D., and going halfway across the world to earn it made it even more exciting. Hence, I came to the US and completed my Ph.D. in Computer Science at North Carolina State University (NCSU) in 2012, with a specialization in performance evaluation and queueing theory.
The next adventure was to teach, and I did so at FAST-NU Lahore in Pakistan and at NCSU and Missouri S&T in the US. Then I made the switch to the tech industry. I was curious to explore the world of global technology giants; thus, I joined Amazon. I worked there for four years and learned a great deal about large-scale distributed systems with an emphasis on highly scalable, fault-tolerant engineering. Next, I aimed to combine my engineering skills with my love for math, advanced probability, and statistics. Thus the field of Data Science became the perfect candidate. Fueled by my curiosity to learn more about startup culture, I decided on my next adventure, i.e., joining (and now leading) the Data Science & Analytics team of Doximity, a San Francisco-based company (then a startup, now public). This is where I currently am.
What excites you the most about being a Data Scientist?
There are several exciting aspects of being a Data Scientist. I can write a whole essay on them, but for now, here are a few thoughts.
Being a Data Scientist encourages you to develop deep technical expertise in both the theory and practice of modeling and analytics. Data science involves working with sophisticated algorithms and machine learning models, which in turn requires a deep understanding of statistical and computational techniques feeding those algorithms.
As a Data Scientist, you have the ability to work across a wide range of industries and domains. Data science is applicable to almost any industry, from healthcare and finance to retail and entertainment, each with its own unique challenges and opportunities.
Data science is also a highly collaborative field, so you will have the opportunity to work with experts from a variety of backgrounds, such as statisticians, engineers, developers, and business leaders, to develop and implement data-driven solutions. This collaboration leads to a diverse range of ideas and perspectives, which ultimately leads to better outcomes.
What distinguishes your work from that of your contemporaries?
I have a Ph.D. in Performance Evaluation and Queueing Theory and am the Director of Data Science & Analytics at the health tech firm Doximity in the USA. One of the most rewarding aspects of being in the data field is the ability to use data to make a positive impact on society.
It is surprising how the healthcare system is one of the last beneficiaries of the digital revolution. Doximity is a digital collaboration platform exclusively for medical professionals, with over 80% of U.S. doctors as its members. Our mission is to save doctors time by providing them with digital workflow tools so that they have more time to do what they do best: take care of patients. My team, the data department consisting of data analysts, data engineers, and ML engineers, is building out advanced analysis pipelines and prediction models to better support digital fax, appointment and referral, telehealth, continuing medical education, and other services for our medical professionals.
At the beginning of 2023, our team released a beta version of DocsGPT, a Large Language Model (LLM)-powered tool that streamlines mundane administrative tasks for doctors and nurses, such as drafting letters to insurers, denial appeals, and post-procedure instructions for patients.
Our alpha version is open to the public for feedback at DocsGPT. Our work got national coverage, including this recent piece in the Forbes InnovationRx Newsletter. It would not be incorrect to say that the team, and Doximity in general, is leading the effort to bring healthcare systems into the 21st century.
What would be your advice to young and aspiring data science students who are trying to navigate the world of data science?
Data Science is an ever-evolving field. As new data sources and technologies emerge, data scientists must continue to develop their skills and knowledge to stay ahead of the curve. This continuous learning and growth can be challenging, but it also makes the field of data science exciting and dynamic. Here are a few suggestions:
- Focus on the fundamentals: It is essential to build a strong foundation in mathematics, statistics, computation, and exploratory data analysis (EDA) techniques.
- Learn programming languages and tools: Programming languages such as Python or R are essential for data science, and so is proficiency with tools such as Jupyter Notebook and Google Colab, and with libraries such as pandas, NumPy, and scikit-learn.
- Practice with real-world data problems: It is crucial to apply your skills to real-world problems to gain experience and develop your problem-solving skills. There are many public datasets available that you can use to practice and develop your skills.
- Develop communication skills: Data scientists need to communicate complex ideas to non-technical stakeholders. Take courses in technical writing, public speaking, and data visualization to improve your ability to communicate insights effectively.
After having experience teaching locally and internationally, could you expand on the ways our tech education varies from international universities? What could be improved?
One thing I appreciate in the US is the increased focus, in the last decade, on introducing computer science and data science concepts to children at an early age. I have seen how it helps foster interest and curiosity in these fields as those children become young adults and choose their disciplines in college or university. There are several organizations, like Hour of Code and Code.org, that provide free CS curricula for students as young as 4th graders. I would love to see our schools in Pakistan adopting some of these “early exposure” techniques as well.
Second, we should encourage our students to embrace the generalist mindset. Computer science and data science are multidisciplinary fields that require knowledge in areas such as mathematics, statistics, and computer programming. Also, as we discussed above, they are rapidly evolving fields, which means we must be willing to learn and unlearn on the fly.
Third, at times, the perceived barrier to entry in the Data Science field is too high. People assume that they need a Master’s or a Ph.D. degree in order to be successful. While having graduate degrees and research experience certainly helps, these are not the only ways to get into the field and become successful. Reputable bootcamps with an emphasis on hands-on experience can provide opportunities for students to develop the requisite problem-solving skills and gain practical experience. However, this must be augmented with an emphasis on continuous professional development and guidance to attend conferences, read research papers, and take online courses to stay informed and keep learning. I am sure initiatives like atomcamp can help generate awareness and fill some of the gaps for our Pakistani students.
What does the future of Big Data look like?
Bright, exciting, and ever-evolving!
We can expect to see even more advanced machine learning algorithms and AI-powered tools. We already have AI bots that generate conversational text, images, music, and computer code. Though opinions about the implications are all over the map (from frenzied enthusiasm to the fear of human race annihilation), one thing is for sure: generative AI products are here to stay.
As more and more organizations embrace data, we can expect to see more advancement in real-time and streaming data processing technologies. Similarly, as more and more devices become connected to the internet, we can expect to see an explosion in the amount of data generated by IoT devices. There is a need for individuals who can analyze, interpret, and derive insights from this data.
As AI and machine learning algorithms become more powerful, there will also be a greater focus on ensuring that these algorithms are used ethically and responsibly. This will involve careful consideration of issues such as bias, fairness, and accountability. Similarly, we can expect an increased emphasis on data privacy and security laws and considerations.
The demand for professionals with expertise in Data Science and Analytics is only going to increase. In addition, the field has the potential to drive meaningful change in a wide range of areas, from climate to healthcare to finance to social justice. Whether you are driven by financial incentives, impactful work in your career, or both, Data Science is the field to consider.
Picture Courtesy: Dr. Bushra Anjum