The Value of Historical Transit Data When It Comes to Machine Learning

Historical transit data, paired with machine learning, is one of many ways transit authorities can find new revenue.

I’m working through the different ways that transit authorities can generate more revenue from their data using APIs as part of my work with Streamdata.io. Making data streamable and truly real-time is the obvious goal of this research, but Streamdata.io is also invested in helping transit authorities take more control over their data resources and use APIs to generate revenue at a time when they need all the revenue they can get.

One overlap in the projects I’m working on with Streamdata.io is where transit data intersects with machine learning and artificial intelligence. I’m not sure what transit authorities are doing with their historical data, but I know that it isn’t available via their APIs and developer portals. I’m guessing they see historical data about schedules, vehicles, ridership, and other data points as a burden, and once they’ve generated the reports they need, they don’t do anything else with it. This historical data is a goldmine for training machine learning models that can be used to understand ridership, make predictions, and improve maintenance, scheduling, and other aspects of transit operations, not to mention commerce, real estate, and demographics.
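To make the idea of learning from historical ridership concrete, here is a minimal sketch of the simplest possible "model": an average of past boardings for each weekday and hour. The record layout (an ISO timestamp plus a boardings count), the function names, and the sample figures are all assumptions made up for illustration; they are not taken from any transit authority's data or API.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def fit_baseline(records):
    """Learn average boardings per (weekday, hour) slot.

    records: iterable of (ISO timestamp string, boardings) pairs.
    This schema is hypothetical, chosen only for the example.
    """
    buckets = defaultdict(list)
    for timestamp, boardings in records:
        t = datetime.fromisoformat(timestamp)
        buckets[(t.weekday(), t.hour)].append(boardings)
    return {slot: mean(values) for slot, values in buckets.items()}


def predict(model, timestamp, default=0.0):
    """Return the historical average for the slot the timestamp falls in."""
    t = datetime.fromisoformat(timestamp)
    return model.get((t.weekday(), t.hour), default)


# Made-up sample history for a single stop.
history = [
    ("2018-03-05T08:00:00", 120),  # a Monday
    ("2018-03-12T08:00:00", 140),  # the following Monday
    ("2018-03-06T08:00:00", 90),   # a Tuesday
]
model = fit_baseline(history)
monday_estimate = predict(model, "2018-03-19T08:30:00")  # a later Monday, 8 a.m. hour
```

A real forecasting model would use far more features, but even a baseline like this is only possible when the historical records are kept and made accessible.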

There is a dizzying amount of investment going into machine learning and artificial intelligence right now, and some of that money could be routed to transit authorities to help boost revenue. If all historical data on transit operations were digitized and available via APIs, then metered using modern API management approaches, it could be an entirely new revenue opportunity for transit authorities. Transit systems are the heartbeat of the cities they operate within, and historical data is the record of everything that occurs. That record can be used to develop machine learning models for the transit industry, as well as for real estate, commerce, and other sectors that transit systems feed into on a daily basis, and have for years.
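As a rough illustration of what metering API access can mean in practice, the sketch below counts calls per API key and prices anything beyond a free allowance. The plan names, allowances, and prices are invented for the example, and the classes are hypothetical; a real deployment would rely on an API management product rather than hand-rolled code like this.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Plan:
    """A hypothetical access tier for a historical-data API."""
    name: str
    free_calls: int        # calls included per billing period at no charge
    price_per_call: float  # dollars charged for each call beyond the allowance


@dataclass
class UsageMeter:
    """Counts calls per API key and converts the counts into an amount due."""
    plans: Dict[str, Plan]  # maps API key -> plan
    calls: Counter = field(default_factory=Counter)

    def record(self, api_key: str, n: int = 1) -> None:
        """Register n calls made with the given key."""
        if api_key not in self.plans:
            raise KeyError(f"unknown API key: {api_key}")
        self.calls[api_key] += n

    def amount_due(self, api_key: str) -> float:
        """Charge only for calls above the plan's free allowance."""
        plan = self.plans[api_key]
        billable = max(0, self.calls[api_key] - plan.free_calls)
        return round(billable * plan.price_per_call, 2)


# Invented plans: a free tier for civic developers, a paid tier for commercial use.
meter = UsageMeter(plans={
    "dev-key": Plan("developer", free_calls=1000, price_per_call=0.0),
    "ml-key": Plan("commercial", free_calls=1000, price_per_call=0.002),
})
meter.record("dev-key", 500)
meter.record("ml-key", 26000)
```

The design point is that the same historical data can stay free for small civic projects while heavy commercial consumers, such as companies training models, pay in proportion to their usage.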

I don’t know what data transit authorities possess. I don’t know how much historical data they keep around or what is required by government regulators. But I do know that whatever there is, it has value. I’ve studied how API management is being used by tech companies for almost eight years now, and it is how value is created and revenue is generated, something that transit authorities and their leadership need to realize applies to them in a digital age. They are sitting on a wealth of historical data that would be of value to the tech companies that are already mining their existing schedules and real-time vehicle data. Historical transit data and machine learning represent just one of many opportunities on the table for transit authorities looking for new revenue.

ai, machine learning, streaming data, real-time data

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
