Erika Fille T. Legara is a scientist interested in the study of complex systems and artificial intelligence. Prior to joining the Asian Institute of Management (AIM) in 2017, Erika was a scientist at the IHPC, A*STAR, Singapore, where she worked closely with government institutions and various industries on different R&D initiatives. At present, she is the director of AIM’s MSc in Data Science program and holds an associate professor position. Legara is also a senior scientist at the Analytics, Computing, and Complex Systems lab at AIM.
PhD in Physics, 2011
University of the Philippines
MSc in Physics, 2008
University of the Philippines
BSc in Physics, 2006
University of the Philippines
We develop a numerical model using both artificial and empirical inputs to analyze taxi dynamics in an urban setting. More specifically, we quantify how the supply and demand for taxi services, the underlying road network, and the public acceptance of taxi ridesharing (TRS) affect the optimal number of taxis for a particular city, as well as commuters' average waiting time and trip time. Results reveal certain universal features of taxi dynamics with real-time taxi booking: there is a well-defined transition between the oversaturated phase, when demand exceeds supply, and the undersaturated phase, when supply exceeds demand. The boundary between the two phases gives the optimal number of taxis a city should accommodate, given the specific demand, road network, and commuter habits. Adding or removing taxis may affect commuter experience very differently in the two phases. In the oversaturated phase the average waiting time is affected exponentially, while in the undersaturated phase it is affected sub-linearly. We analyze various factors that can shift the phase boundary, and show that an increased level of acceptance for TRS universally shifts the phase boundary by reducing the number of taxis needed. We discuss some useful insights on the benefits and costs of TRS, especially how, under certain situations, TRS not only has economic benefits for commuters but can also reduce the overall travel time for the parties sharing a ride by significantly reducing the time commuters spend waiting for taxis. Our simulations also suggest that simple artificial taxi systems can capture most of the universal features of the taxi dynamics. The relevance of the assumptions and the overall methodology are also illustrated using the empirical road network and taxi demand in Singapore.
Here, using an ensemble of machine learning models, a procedure is demonstrated that classifies passengers (Adult, Child/Student, and Senior Citizen) based on their three-month travel patterns. The method proceeds by constructing distinct commuter matrices, which we refer to as eigentravel matrices, that capture a commuter’s characteristic travel routine. Comparing various classification models, we show that the gradient boosting method gives the best prediction with 76% accuracy, 81% better than the minimum model accuracy (42%) computed using the proportional chance criterion.
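For readers unfamiliar with the proportional chance criterion, the comparison quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the criterion is taken in its usual form (the sum of squared class proportions), and the class shares in the example are made up for demonstration and are not the data from the study. Only the 76% and 42% figures come from the abstract.

```python
def proportional_chance_criterion(class_counts):
    """Chance-level accuracy: sum of squared class proportions.

    `class_counts` may hold raw counts or shares; both are normalized.
    """
    total = sum(class_counts)
    return sum((count / total) ** 2 for count in class_counts)


def relative_improvement(model_accuracy, baseline_accuracy):
    """Fractional gain of a model's accuracy over a baseline accuracy."""
    return model_accuracy / baseline_accuracy - 1.0


# Hypothetical class shares (NOT the study's data), three passenger classes.
example_shares = [0.5, 0.3, 0.2]
example_baseline = proportional_chance_criterion(example_shares)

# Figures quoted in the abstract: 76% model accuracy vs. a 42% baseline.
gain = relative_improvement(0.76, 0.42)
print(f"hypothetical baseline: {example_baseline:.2f}")
print(f"improvement over the reported baseline: {gain:.0%}")
```

The second calculation shows how the quoted "81% better" follows from the two accuracies: 0.76 divided by 0.42 is roughly 1.81.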
In this work, we focus on the complex relationship between land use and transport, offering an innovative approach to the problem by using land-use features at two differing levels of granularity (the more general land-use sector types and the more granular amenity structures) to evaluate their impact on public transit ridership in both time and space. To quantify the interdependencies, we explore three machine learning models and demonstrate that the decision tree model performs best overall: good predictive accuracy, generality, computational efficiency, and “interpretability”.
Framing, in its specific application to media research, is defined as the “central organizing idea for making sense of an issue or conflict and suggesting what is at stake.” It can be found in various disciplines of the social sciences, most notably in political science, psychology, and communication research. Due to the fuzzy nature of frames, identifying them has proven to be quite complex. Here, we perform framing analysis on a corpus of news texts on the population and family planning issue in the Philippines using two different approaches: human-based and computer-assisted. A singular holistic approach to framing is initially implemented, where coders/domain experts classify each news text under a specific pre-defined frame. This traditional approach is known to raise serious issues about the reliability and validity of the results, mainly due to humans’ intrinsic biases. To address such issues, we propose a novel technique that synergistically combines the method of Matthes and Kohring (2008) (MK) with a complex networks approach. In our model, the codings of texts are cast as a network of content analytic variables (CAVs). Our proposed method tackles the clustering issue that MK raised, which plagues framing scholars in the quantitative identification of news frames in texts. Moreover, the research is significant on a societal level, as it also aims to shed light on the reasons for the lack of progress in discussions about suitable population policies in most developing countries like the Philippines.
I am currently the co-project lead of a smart city project funded under the DOST-PCIEERD Industry, Energy and Emerging Technology …
My colleague Chris Monterola and I have been tapped by the Philippine Government through the Department of Trade and Industry (DTI) …
We have an ongoing project with Manila Water, Corp. funded through the DOST-PCIEERD CRADLE grant. The title of the project is …
Aside from teaching at AIM, I also supervise students in R&D especially when they engage industry/government stakeholders as part …
In the past few years, I have been involved in various industry and government projects. Some of them are listed here.
To support and promote women and gender minorities in ML and DS.
Event that brings together experts to share short data stories with the public over free beers.
For International Women’s Day 2020, we’re getting to know the pioneering women across East Asia Pacific who are breaking barriers and …
Data Scientists Needed in Every Industry, Experts Say
Introducing data science as ‘future’ of policy making in PH
This Pinay Is Leading Data Science Education In The Philippines
AIM’s big data bid to regain business school leadership
Filipina Physicist Back from SG to Head AIM’s Data Science Program
Machine-learning program predicts public transport use in Singapore
This course introduces participants to the latest trends in analytics in the era of big data, artificial intelligence, and the Internet of Things. The course explores various data-driven approaches, frameworks, and models used by different industries across functions to improve processes and/or create new and innovative products. In particular, participants will familiarize themselves with the different levels of analytics (descriptive, predictive, and prescriptive) and will be tasked to identify use cases where the approaches can be applied.
Complex systems are systems composed of heterogeneous agents that are highly interacting and whose interactions result in emergent behavior, e.g., societies, economies, markets, cities, and biological systems like the immune system and the brain, to list a few. In this class, students will be exposed to various tools used in characterizing and modeling complex systems. The topics include dynamical systems, chaos, fractals, self-organization, cellular-automata modeling, agent-based modeling, and complex networks.
The module covers the basics of Complexity Science with particular focus on Complex Networks (network science), which are the backbones of complex systems (e.g., cities, organizations, economies, and financial markets). Complex networks quantify the interactions of various entities/players in complex systems. Examples of complex networks include social networks like those generated from Twitter, Facebook, and Instagram, financial networks, biological networks, and organizational networks. Students learn how to visualize, analyze, and model complex networks using Python, NetworkX, and Gephi. At the end of the course, students should be able to view and analyze problems in business and marketing, among others, through the lens of complexity science. They should also be able to argue, in a descriptive and quantitative manner, why system-of-systems thinking is necessary to address most real-world issues.
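As a rough illustration of what quantifying interactions in a network can look like, the sketch below computes two basic measures, degree centrality and density, for a small undirected network. It is written in plain Python so that it runs without extra packages; it is not taken from the course materials. The helper names and the toy edge list are hypothetical. NetworkX offers equivalent functions (`nx.degree_centrality` and `nx.density`) for real analyses.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def density(adjacency):
    """Share of all possible undirected edges that are actually present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    edge_count = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return 2 * edge_count / (n * (n - 1))


# Hypothetical toy network: "A" is linked to everyone else.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
graph = build_graph(toy_edges)
centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
print(most_central, round(density(graph), 3))
```

In this toy network, node "A" touches every other node, so it has the highest degree centrality, which is the kind of quantitative statement about a system's structure that the module asks students to make.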
In this course, students learn data science fundamentals that are more in tune with their applications to business; essentially, how the field is applied in the real world. Students are provided with a comprehensive overview of data science and artificial intelligence: what they are and what they’re not. Students are also exposed to the current state of data science and its future direction(s). The class has data science practitioners share their experiences, from how companies come up with a data strategy toward becoming a truly data-driven organization, to building data science teams, to the challenges companies have faced and are currently facing. Participants learn about data workflows and pipelines, and learn how to assemble and lead data science enterprises. Finally, the course also covers the fundamentals of data privacy and data/AI ethics.
In this course, students will learn to appreciate the importance of successful data visualizations and intelligible stories in communicating insights. Using real-world datasets, learners will gain the necessary skills to fashion effective vizzes that exhibit not only good design elements but also layers of information that, when woven together as a narrative, can drive stakeholders to take action. Storytelling will be emphasized across the sessions. On the more technical side, students will also widen their visualization vocabulary. In addition, they will be introduced to the different viz tools available, including Tableau, QGIS, and Gephi (a network visualization tool). They will also learn how to create visualizations in Python with pandas, networkx, geopandas, matplotlib, and plotly, among others.