Artificial intelligence is incredible - but not inherently so. Learn how AI has to be trained before it can reach its full potential.
Today’s digital world is already rife with systems and technologies only made possible by advances in artificial intelligence (AI). Have you been shopping online recently, or interacted with any kind of online media? Chances are AI was used to optimize the targeted ad and product placement algorithms behind them. Spoken to Siri, or translated something on Google? That employs AI-assisted natural language processing. Taken a photo with a recently released smartphone? AI runs on top of your phone’s camera hardware, sharpening and enhancing the picture you took. With diverse learning algorithms, prediction models, deep learning, and more, AI can be molded to empower practically any industry.
As we’ve detailed in a previous article, the applications of AI are extremely vast. Even with the advent of deep learning, neural networks, and other recent breakthroughs, we’ve merely scratched the surface. No, AI can’t yet replace the intuition, experience, wisdom, or complexity of a trained human mind, but it is here that humans and AI intersect: we both learn. AI is as capable as it is because it has the potential to learn even more than we humans can.
Yet in the same way that a child requires a good education to achieve their true potential, AI does too. AI must be trained with the right algorithms, tools, data sets, and more, just like how a human requires textbooks, research, and practical workshops. Without guidance, neither can be expected to complete complex tasks.
We’ve established that before AI can step in and work its “magic”, human specialists must first provide training instructions, essentially telling the program what to do. After all, machines don’t just develop a personality with independent, innovative thoughts of their own. Note the examples provided in the opening paragraph of this article: all the mentioned activities involve processes and systems that we already have a pretty good grasp of. Specialists in each field already understand the core relationships between the different components of each working process. Still, the question remains: how do they turn that knowledge into a training module for AI?
Let’s think back to the industrial revolution’s physical machinery. Though those complex machines dramatically reduced the need for manual labor, they did not inherently get any smarter or more capable. One moving part physically affects another, with gears placed and rotating in the most efficient position, but the pieces of metal, wood, or otherwise have no governing “mind” to be aware of anything. Humans have had to stick around ever since, observing, learning, and innovating.
It’s a little different in today’s digital world. When we digitize information, we’re able to track statistics and data down to extremely precise scales. As the limits of computing power have increased, so has our ability to define, gather, assess, and utilize that data. In fact, an exceedingly important subset of AI designed to do these very things is called Machine Learning (ML). Where a physical machine can only be improved by hand, within human limitations, digitized information can simply be fed through a program that learns to do what we can’t.
Those familiar with ML will know that there are three main ways to train ML models: supervised learning, unsupervised learning, and reinforcement learning. Each has its own pros and cons, but the most consistent and most widely employed method is supervised learning. Properly orchestrated, it lets us take AI to the next level, completing tasks so complex and so vast in quantity that the human mind simply can’t keep up.
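To make the distinction concrete, here is a minimal sketch (using scikit-learn on made-up toy data, so the numbers and labels are purely illustrative) of how supervised learning differs from unsupervised learning: one learns from human-provided labels, the other looks for structure on its own.

```python
# A minimal, illustrative sketch: supervised vs. unsupervised learning
# with scikit-learn. The fruit measurements below are made up.
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy feature vectors: [weight in grams, length in cm] of individual fruits.
X = [[120, 18], [130, 20], [150, 7], [160, 8]]

# Supervised learning: every example comes with a human-provided label.
y = ["banana", "banana", "apple", "apple"]
clf = LogisticRegression().fit(X, y)
print(clf.predict([[125, 19]]))  # -> ['banana']

# Unsupervised learning: no labels; the algorithm groups similar examples,
# but it cannot name the groups on its own.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)  # e.g. [1 1 0 0] - cluster ids, not named categories
```

Reinforcement learning, the third approach, instead learns from trial-and-error feedback rather than from a fixed, labelled dataset.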
Supervised learning relies entirely on quality training datasets. Raw data on its own is literally just, well, data; it isn’t useful yet. Making it useful requires three steps: data acquisition, data cleaning, and data labelling.
The first step, data acquisition, might seem obvious: you need a source of data before you can create a dataset at all.
Because useful datasets depend on both high quality AND high quantity, the most important thing is to establish a channel, workflow, or process that allows high-volume intake of information from reputable, trustworthy sources.
If the data collected is inaccurate, or simply too small in quantity, the resulting dataset won’t be very useful, as it won’t account for enough parameters. It’s also very important to prioritize organization as much as possible, as will become evident in the next step of the process.
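As a rough illustration of what such an intake workflow might enforce, here is a small sketch; the field names and source identifiers are hypothetical, and a real pipeline would be far more involved.

```python
# A minimal, hypothetical intake check: records are accepted into the
# dataset only if they are complete and come from a trusted source.
REQUIRED_FIELDS = {"source", "timestamp", "payload"}
TRUSTED_SOURCES = {"hospital_a", "hospital_b"}  # hypothetical source ids

def accept(record: dict) -> bool:
    """Return True if the record is complete and from a trusted source."""
    return REQUIRED_FIELDS.issubset(record) and record.get("source") in TRUSTED_SOURCES

incoming = [
    {"source": "hospital_a", "timestamp": "2023-01-05T10:00", "payload": "..."},
    {"source": "unknown_forum", "timestamp": "2023-01-05T11:00", "payload": "..."},
    {"source": "hospital_b", "payload": "..."},  # missing timestamp
]
accepted = [r for r in incoming if accept(r)]
rejected = [r for r in incoming if not accept(r)]
print(f"{len(accepted)} accepted, {len(rejected)} held back for review")
```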
The second step is data cleaning. After data is acquired, it must be standardized and made ready for analysis. If your data is, for example, an image, but that image is too grainy to clearly show its intended subject, it requires “cleaning”, which means either additional processing through various types of software or reacquiring the data altogether.
The same can be said of data that uses different units of measurement: without explicit configuration, a computer won’t be able to tell “1” day from “1” hour. These are just a couple of examples of why data cleaning is essential to the process.
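A tiny sketch of that unit problem, using pandas with hypothetical column names, might look like this; the point is simply that mixed units get normalized and unusable rows get flagged, not that this is a complete cleaning pipeline.

```python
# A minimal cleaning sketch with pandas; column names are hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "duration": [1, 1, 24, None],                # the bare number "1" is ambiguous...
    "unit":     ["day", "hour", "hour", "day"],  # ...until paired with its unit
})

# Normalize "1 day" vs. "1 hour" to a single unit so values are comparable.
unit_to_hours = {"day": 24, "hour": 1}
raw["duration_hours"] = raw["duration"] * raw["unit"].map(unit_to_hours)

# Rows with missing values can't be used as-is: drop them, or reacquire the data.
clean = raw.dropna(subset=["duration_hours"])
print(clean[["duration_hours"]])
```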
Now that information has been acquired and cleaned, it must be labelled so it can actually be analyzed. This final step is arguably the most important, as it’s also often the most complex.
Data labelling is all about precision and accuracy. In this step, specialists (for medical data, physicians) take the cleaned data and determine what it represents in the real world. Because there is still so much that is unknown and undetermined, a multi-step review process produces the best results.
That’s not to say every dataset is inherently complex. A simple example of data labelling would be a picture of a banana, labelled “a banana”. Pit that against the infinitely complex system that is our body’s biology, and you can start to imagine just how involved things can get. From visually identifying specific veins, arteries, and pathways in our bodies to discerning the impact of elevated proteins, hormones, and other bits of chemistry within us, it’s a grand ordeal to say the least.
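To make that contrast tangible, here is a sketch of what labelled records might look like; the file names, fields, and findings are entirely hypothetical and only meant to show how much richer a medical annotation can be than a simple tag.

```python
# Hypothetical labelled records: a simple tag vs. a richer medical annotation
# that carries regions, findings, and a multi-step review trail.
simple_label = {
    "image": "fruit_0001.jpg",
    "label": "banana",
}

medical_label = {
    "image": "ct_scan_0042.dcm",
    "regions": [
        {"structure": "artery", "outline": [[120, 88], [131, 90], [128, 102]]},
    ],
    "findings": {"elevated_protein_x": True},  # hypothetical finding
    "review": [
        {"annotator": "physician_1", "status": "labelled"},
        {"annotator": "physician_2", "status": "approved"},  # second-pass review
    ],
}
```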
Data labelling is highly useful for physicians of all levels. Physicians currently undergoing intense study and research benefit from being able to practice their newly acquired knowledge, while physicians at the top of their field also get to hone their skills and breadth of experience.
Medicine aside, ML can be trained to consider absolutely every process involved in, say, food delivery. Every possible parameter and the relationships between them are accounted for: types of vehicles, the relationship between fuel type and speed, the fastest route between restaurant and customer, as well as traffic, location, price details, and more. A properly trained AI/ML-driven program, saturated with all this data, would be quicker, more efficient, and less prone to human error in its analysis. It would understand how each data point affects another, and as a result be capable of picking up on combinations and patterns that lead to actionable insights, such as determining the most efficient type of vehicle for food delivery on a specific route under specific traffic conditions.
A food delivery company with this kind of tech would immediately gain vast advantages. It would know which vehicle type is the most efficient, when to suggest which routes for the fastest delivery times, the kinds of restaurants customers seem to enjoy the most, and much more.
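As a rough sketch of the idea, the delivery-time piece could be framed as supervised regression; the features, numbers, and model choice below are hypothetical and only illustrate how historical deliveries become training data.

```python
# A minimal sketch: predicting delivery time from past deliveries with
# scikit-learn. All feature values and times below are made up.
from sklearn.ensemble import RandomForestRegressor

# Features per past delivery: [distance_km, traffic_level (0-3), vehicle_type_id]
X = [
    [2.0, 1, 0],   # short trip, light traffic, motorbike
    [5.5, 3, 1],   # longer trip, heavy traffic, car
    [3.2, 2, 0],
    [8.0, 1, 1],
    [1.1, 0, 0],
    [6.3, 3, 1],
]
y = [12, 41, 22, 35, 7, 44]  # observed delivery times, in minutes

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Estimate the delivery time for a new order: 4 km, moderate traffic, motorbike.
print(model.predict([[4.0, 2, 0]]))
```

With enough well-labelled history, the same framing extends to route suggestions, vehicle choice, and the other advantages described above.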
Of course, this is all much easier said than done. Gathering the vast amount of data required for such a capable program would take years of careful collection and fine-tuning before the analysis could even begin. This is where the world’s technologies currently stand. Before we can create, innovate, and expand the boundaries of our knowledge and capabilities, we must precisely understand what we’re working with. That requires the expertise and knowledge of the world’s greatest pioneers in any given subject.
Ever Medical Technologies continues to push the envelope in combining the world’s preeminent technologies with leading physicians’ experience. We understand that progress isn’t a destination; it’s more like a coursing, winding, unending river. While we have employed technologies like blockchain and variations of AI/ML modules in our existing products, we also acknowledge the vast sea of knowledge that remains untapped.
Our desire to aim ever higher led to the establishment of Ever AnnoMed, a clinical research organization that will stand as a beacon for medical researchers across the globe. Its main objective is to facilitate a thriving data labelling community for AI/ML training.
Ever AnnoMed will be able to leverage highly qualified medical professionals who can label large quantities of data with precision and speed, all at affordable prices. Uniquely positioned alongside a vast network of university and hospital partners across Thailand, our goal is to learn, grow, and inspire alongside each other. We’re eager to further the mutually beneficial practice of data labelling: physicians from non-AI fields and practices gain active learning opportunities, while our talented in-house AI team provides support and is trained to help weed out false positive and false negative labels. We also support a massive range of data types (DICOM, NIfTI, ultrasound, text data, etc.), having designed our blockchain core infrastructure to accommodate global data practices and standards.
Like everything else we set out to do, Ever AnnoMed will uphold the highest standards of quality, precision, and accuracy, all in pursuit of better diagnostic tools, neural-network services, increasingly efficient data storage and transfer, and more. In truth, it’s hard to comprehend just how much innovation and discovery comprehensive data labelling could lead to - but we’re here for it, and we’re beyond excited to be taking our first step toward training and preparing the world for its next evolution.