All you need to know about Generative AI

People have been speculating about the possibility that machines might someday think since the late 1800s, but the idea really took root with Alan Turing's seminal paper in 1950. Historians have called Turing the father of AI.

He theorized that we could create computers that could play chess, that they would surpass human players, and that we could make them proficient in natural language. He theorized that machines would eventually think.

If Turing's 1950 paper was the spark, then just six years later we had the big bang: the Dartmouth workshop.

A couple of young academics got together with a couple of senior scientists from Bell Labs and IBM and proposed an extended summer workshop, with just a small handful of top people in adjacent fields, to intensively consider artificial intelligence.

That is how the phrase artificial intelligence was coined, and it marks the point at which AI was established as a field of research.

They laid out in extensive detail many of the challenges that we've been working all these years to solve in order to develop machines that could potentially think: neural networks, self-directed learning, creativity, and more, all still relevant today.

For perspective, this was 1956, the same year the invention of the transistor won the Nobel Prize.

Now, we can have over 100 billion transistors on a GPU, and banks and banks of interconnected GPUs to provide the compute power to create and execute generative AI functions. All these years, the AI theories, techniques, and ideas have been developed in parallel with progress in hardware that resulted in dramatic reductions in compute and storage costs, all converging now to make generative AI real and practical.

It's not just about powerful hardware and clever algorithms. The third, and maybe the most important ingredient, particularly when it comes to your business, is data.

You can't talk about generative AI without talking about data. It's the third leg of the AI stool, model architecture, plus compute, plus data.

You hear about large language models, or LLMs, that are powering generative AI. So what are they?

At a basic level, they're a new way of representing language in a high-dimensional space with a large number of parameters, a representation you create by training on massive quantities of text.

From that perspective, much of the history of computing has been about coming up with new ways to represent data and extract value from it. We put data in tables, rows of employees or customers, and columns of attributes in a database.

This is great for things like transaction processing or writing checks for payments to individuals.

Then we started representing data with graphs, and we began to see relationships between data points: this person, or this business, or this place is connected to these other people, businesses, and places. Data represented this way starts to reveal patterns, and we can map a social network or spot anomalous purchases for credit card fraud detection.
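As a toy illustration of the idea, a graph can be stored as a simple adjacency list, and a breadth-first search can answer the kind of connection questions a fraud-detection system asks. The entities and edges here are invented for the example:

```python
# Toy sketch (not any specific product): representing data as a graph
# lets us ask relationship questions that are awkward in flat tables.
from collections import deque

# Hypothetical connections between people, businesses, and places
edges = {
    "alice": ["acme_corp", "bob"],
    "bob": ["alice", "cafe_9"],
    "acme_corp": ["alice", "cafe_9"],
    "cafe_9": ["bob", "acme_corp"],
}

def connected(graph, start, goal):
    """Breadth-first search: is there any path linking two entities?"""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False

print(connected(edges, "alice", "cafe_9"))  # True: linked via bob or acme_corp
```

The same traversal, run over millions of accounts and transactions, is what lets a system flag a purchase that connects to a known cluster of fraudulent entities.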

Now, with large language models, we are talking lots of data and representing it in neural networks that simulate an abstract version of brain cells. Layers and layers of connections, with tens of billions or hundreds of billions, even trillions of parameters.

And suddenly, you can start to do some fascinating things. You can discover patterns that are so detailed that you can predict relationships with a lot of confidence. You can predict that this word is most likely connected to this next word. These two words are most likely followed by a specific third word, building up, reassessing, and predicting again and again until something new is written. Something new is created or generated, that's what generative AI is.
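The build-up-and-predict loop described above can be sketched with a toy next-word model that just counts which word follows which in a tiny corpus. Real LLMs learn far richer statistics with billions of parameters, but the generate-by-predicting loop has the same shape:

```python
# Minimal sketch of the idea: count which word tends to follow which,
# then "generate" by repeatedly predicting the most likely next word.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent successor seen in the toy corpus."""
    return follows[word].most_common(1)[0][0]

word, generated = "the", ["the"]
for _ in range(3):
    word = predict_next(word)
    generated.append(word)
print(" ".join(generated))  # the cat sat on
```

Each prediction feeds back in as context for the next one, which is exactly the "building up, reassessing, and predicting again" described above, just at microscopic scale.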

The ability to look at data and discover relationships and predict the likelihood of sequences with enough confidence to create or generate something that didn't exist before. Text, images, sounds, whatever data can be represented in the model.

With deep learning, we started representing a massive amount of data using very large neural networks with many layers. But until recently, a lot of the training happened using annotated data, data that humans would label manually. We call this supervised learning, and it's expensive and time-consuming, so only large institutions were doing that work, and it was done for specific tasks.

But around 2017, we saw a new approach, powered by an architecture called transformers, to do a form of learning called self-supervised learning.

In this approach, a model is trained on a large amount of unlabeled data by masking certain sections of the text (words, sentences, etc.) and asking the model to fill in those masked words. This amazing process, when done at scale, results in a powerful representation that we call a large language model.
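The masking trick can be illustrated without any model at all: the point is that input/target pairs are manufactured from the raw text itself, so no human labeling is required. This sketch just fabricates one such training example (the `[MASK]` token and the chosen positions are illustrative):

```python
# Illustrative sketch of self-supervision: the hidden words become the
# training targets for free, with no human annotation. A real model would
# then be trained to predict the targets from the masked input.

def make_masked_example(text, mask_positions):
    """Replace words at the given positions with [MASK]; return input + targets."""
    words = text.split()
    targets = {i: words[i] for i in mask_positions}
    for i in mask_positions:
        words[i] = "[MASK]"
    return " ".join(words), targets

masked, targets = make_masked_example(
    "the model learns to fill in the blanks", mask_positions=[2, 5])
print(masked)   # the model [MASK] to fill [MASK] the blanks
print(targets)  # {2: 'learns', 5: 'in'}
```

Because any text can be turned into examples this way, the training data is effectively the entire corpus, which is what makes training at Internet scale possible.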

These LLMs could be trained on huge volumes of Internet data and acquire a humanlike set of natural language capabilities. Self-supervision at scale, combined with massive data and compute, gives us representations that are generalizable and adaptable.

These are called foundation models, large scale neural networks that are trained using self-supervision, and then adapted to a wide range of downstream tasks.

This means that you can take a large, pretrained model, ideally trained with trustworthy, industry-specific data, and add your institutional knowledge to tune the model to excel at your specific use cases. You end up with something that is tailored for you, but also quite efficient and much faster to deploy.

Imagine what AI can do for the pace of discovery and innovation, what it can do for discovering new materials, for medicine, for energy, for climate, and so many of the pressing challenges that we face as a species. Ultimately, our success depends on how we approach AI.

We have seen new models, evolved models, and an explosion of open models. Generative AI has gone from being a fascinating novelty to a new business imperative in less than a year. And every day, there is news of a new use case or application.

I'm going to leave you with four main pieces of advice.

Number 1, protect your data. Your data, and the representations of that data, which, as I just explained, are what AI models are, will be your competitive advantage. Don't outsource that; protect it.

Number 2, you have to make sure that you are embracing principles of transparency and trust, so that you can understand and explain as much as possible of the decisions or recommendations made by AI.

Number 3, you want to make sure that your AI is implemented ethically, that your models are trained on legally accessed quality data. That data should be accurate and relevant, but also control for bias, hate speech, and other toxic elements.

And number 4, don't be a passenger. You need to empower yourself with platforms and processes to control your AI destiny. You don't need to become AI experts, but every business leader, every politician, every regulator, everyone should have a foundation from which to make informed decisions about where, when, and how we apply this new technology.

Become a value creator with Gen AI

Even if you don't have a formal AI initiative or an AI team in place, it's already in use across your business. It might be baked into your off-the-shelf applications. It may show up in chatbots, in HR, self-service portals, and transcription services. These are what we might call traditional AI. The focus is on executing discrete tasks.

Usually, each instance is trained individually on its own data that you had to compile and label.

Now what makes generative AI different is that it is enormously flexible and not limited to narrow tasks. Think of it as, instead of filling specific blanks, it can write the whole document from scratch to the point of being able to create or generate something entirely new. What makes this possible are foundation models.

Foundation models

Foundation models are not trained the same way as traditional AI. Instead, they're trained using self-supervised learning.

You don't have to manually annotate a massive amount of data. You tell them to go read enormous amounts of text, and they do. You end up with a large and versatile model with more human-like language capabilities.

Algorithms use mathematical models to represent the relationships between the words they ingest. If you give the model a few words in a prompt, it can mathematically predict the likelihood of words in the response.
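One way to picture that prediction step: the model assigns a score to every candidate next word, and a softmax turns those scores into a probability distribution. The scores below are made up purely for illustration:

```python
# Hedged sketch: turning a model's raw scores for candidate next words
# into probabilities with a softmax. The scores here are invented.
import math

def softmax(scores):
    """Exponentiate and normalize so the values sum to 1."""
    exps = {w: math.exp(s) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

# Hypothetical scores for the word after "The capital of France is"
scores = {"Paris": 5.0, "Lyon": 2.0, "banana": -1.0}
probs = softmax(scores)
print(max(probs, key=probs.get))  # Paris
```

A real model produces one such distribution over its entire vocabulary at every step, then samples or picks from it to continue the response.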

Instead of needing to build one AI model for each specific task, you can train one model and adapt it to many different downstream tasks.

We went from one task, one model to one model, many tasks. Your chatbot and your HR self-service can be built on the same model as the new app that will write your marketing emails and summarize legal documents.

That's the first critical point. A model isn't the final form of AI; it's just the foundation you build on. How you use it is up to you.

Modes of AI consumption

When it comes to using AI, there are basically three modes of consumption.

  1. Embedded AI:

    The first is embedded AI, which I already mentioned: AI baked into off-the-shelf software. The software vendor creates the AI, and you put it to use in your business, whether it's a writing assistant that can help you strike the right tone in your email or image editing software that can automatically process your images.

    You get access to some great functionality that can make you more productive, which we always want. But the caveat is that what you can buy, so can everybody else. Those capabilities and productivity opportunities don't become differentiators.

    They set a new, higher baseline for everyone.

  2. The second mode of consumption is API calls.

    As you develop custom applications for your business, they can call out to another company's AI service, using that company's models and processes, and then return the results. This is also a viable way of consuming AI. Depending on how cleverly you use the APIs and the diversity of AI service providers you use, you can start to differentiate how you put AI to work relative to your competitors.

    But there are caveats here too. The first is that, like with software, the models and services you tap into are available to everyone.

    The second is that when that API call goes out, it's connecting to what looks like a black box. You don't necessarily know what's happening on the other end, or the provenance and governance of the data used, which can make people nervous, because your business is still accountable for the final outcomes.

    A second word of caution when using someone else's AI has to do with the creation and accrual of value over the long term.

    In the past, we've seen a lot of value-extractive business models.

    Another company will offer you a service like this API call, and you get value from that.

    But the other company is also extracting value from your usage and from your data, accumulating more and more.

    How much faster is their value growing than yours? Well, you can see it in the stock prices over time. There is an imbalance in the relationship that can have long-term consequences, both for your specific business and for the overall economy and progress of technology. This goes back to the metaphor I used earlier. Philosophically, do we as a society really want just a few keepers of the fire upon which we are all dependent? Is that what's best for your individual business, for your shareholders?

  3. Platform Model

    The third mode of AI consumption is the platform model. This is the most comprehensive. This is how you become your own AI fire-starter.

    No, that doesn't mean going it alone, reinventing AI from scratch, and spending years and millions of dollars to build your own models.

    With a platform, you have all the elements and ingredients in place to build your own AI solutions. You have foundation models, you have tools to improve and customize models, and you have processes to build your own tailored AI solutions. Importantly, you create and accrue value that is unique to your business.

Foundation models deep dive

Starting with foundation models. Foundation models are large-scale, deep neural networks, trained with lots of unlabeled data and subsequently adapted to many downstream tasks. It may be a broad, general model or it may be a narrower, deeper model. But the key is that it is pretrained with the expectation that you can further enhance it with your own proprietary data.

Like when a new employee joins your business, they come in with some general skills as a foundation and the ability to learn. The more they learn about your business, the more they add institutional knowledge and expertise, the more value they deliver. Well, the same is basically true of a foundation model.

You use your AI platform to tune it with your specific business data, your proprietary knowledge, and expertise. It becomes more expert and more valuable for your business over time. Because you're in control of the platform and the processes and the data, you accrue ever larger amounts of value over time.

With some of the consumer AI on the market, we've already seen some of what happens when you surrender that control. You can get bad data that leads to bad outcomes. You can get hallucinations: basically, AI generating very confident and very incorrect answers. You can get into some trouble for inadvertently using someone else's rights-managed content. We've even seen proprietary or sensitive data being inadvertently leaked back into the public space. That's why you need to know how your model is built and what data was included, and it's why tight control of your sensitive data should be prioritized. Strong AI governance is absolutely critical.

There is, I think, a myth about AI right now, a basic misunderstanding. For the general public, generative AI has seemingly come out of nowhere. A lot of people think that there's a handful of consumer-oriented AI experiences out there and that one model is going to win, or just a few leaders. I don't think that's how it's going to play out.

The future of AI is not about one model; it's multi-model. Your business will be using multiple fine-tuned models to achieve the best results when applied to specific use cases. That's why the platform approach is so important. Realistically, the future of AI is not only proprietary. It will also be powered by open science and open source.

Proprietary models will play a part. But so much of what is going to happen in the future will not happen behind closed doors. It will play out in plain view, with the full transparency and accountability that open source provides. The energy in the open source community is phenomenal. There are distributed projects, university projects, corporate efforts, all driving innovation, producing foundation models that you can tune and deploy for your use cases.

Hugging Face is like GitHub for foundation models; there are over 325,000 open source models available and thousands more being added. This is exactly how it should be. For the good of society in the long term, we don't want just one or a few winners, a few companies that can define what AI is and dictate how it's used.

Why Foundation Models are a Paradigm Shift for AI

Training a new large language model is a bit like launching a rocket.


It's exciting.

It's resource-intensive.

It requires an enormous amount of compute power.

And the training process takes months.

So you need intensive planning and preparation to make sure you've got the latest and best technologies in place. Because once you press go and the GPUs fire up and start training, the rocket has lifted off.

You can no longer tweak the design. Any new innovation has to wait until the next launch. Just like rocket launches changed the frontier of science, large language models, and the broader class of generative AI they belong to, called foundation models, represent a paradigm shift in how the world is going to leverage AI.

While traditional AI can analyze data and tell you what it sees, generative AI can use that same data to create something new, and that's a vital tool for businesses to have because that same power can be applied to customer service and support, code generation for developers, extracting key information from complex documents. More use cases are being developed every day. Companies can increase productivity, reduce costs, and open up new lines of business.

While traditional machine learning is narrowly focused, purpose-built for a specific task, and takes a lot of human intervention, foundation models are bigger, broader, general-purpose models that benefit from self-supervised learning, which means they can be trained on large unlabeled datasets.

Then afterwards, this general purpose model can be further tailored for an array of applications. The types of things these models can do is evolving incredibly quickly. So now is the time to start building your expertise.

Get started

When looking to get started, building expertise is critical.

First, you need to establish a team of people who can become comfortable and fluent working with foundation models so that they can experiment, testing out new models as they become available, prototyping on example use cases, and so on.

The second step is to pick an internal low-risk use case that you can use as a testing ground. You could build a prototype and test out deployment, then use what you learned as your team gains more experience.

Third, you need to have an in-depth conversation about what you require to get to the real value drivers and revenue drivers that generative AI can help you unlock. For example, you need to determine what requirements around trustworthiness and other regulatory issues your models need to meet to be deployed in production. All those questions only become more relevant as you leave the experimentation phase and get into actually building a model for a real application that can drive business impact.

Finally, you need to be able to operate with a level of responsibility and transparency. You've got to be transparent regarding data collection, showing what is and isn't in your data and how it all gets filtered and managed. You need to be able to explain how your AI is making decisions. You want it to be fair and trustworthy and ready for compliance with upcoming regulations.

When you're getting started on your journey with generative AI and looking at all the options available, my recommendation is to start simple. Start with a pretrained model and try to do light customizations with your own data through a process called tuning. This way you can tailor the model for your specific use cases while taking advantage of the large, general purpose capabilities that other providers have developed. It's important though to update those pretrained models every couple of months.
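As a conceptual sketch of that kind of light customization, imagine keeping the big pretrained model frozen and training only a tiny task head on your own labeled examples. The "frozen features" and labels below are invented stand-ins for what a real encoder would produce:

```python
# Conceptual sketch of "light customization": the big pretrained model stays
# frozen; we train only a small task head on our own labeled examples.
# Everything here (the feature vectors, the labels) is made up for illustration.
import math

# Pretend these 2-d vectors come from a frozen pretrained model's encoder.
frozen_features = {
    "refund my order":      [1.0, 0.2],
    "cancel subscription":  [0.9, 0.1],
    "love this product":    [0.1, 0.9],
    "great service thanks": [0.2, 1.0],
}
labels = {"refund my order": 1, "cancel subscription": 1,
          "love this product": 0, "great service thanks": 0}  # 1 = complaint

# Tiny trainable head: logistic regression fit with plain gradient descent.
w, b = [0.0, 0.0], 0.0
for _ in range(500):
    for text, x in frozen_features.items():
        z = w[0] * x[0] + w[1] * x[1] + b
        p = 1 / (1 + math.exp(-z))      # predicted probability of "complaint"
        err = p - labels[text]          # gradient of the log loss w.r.t. z
        w[0] -= 0.1 * err * x[0]
        w[1] -= 0.1 * err * x[1]
        b    -= 0.1 * err

def classify(x):
    """1 = complaint, 0 = not, for a new frozen-encoder feature vector."""
    return 1 if (w[0] * x[0] + w[1] * x[1] + b) > 0 else 0

print(classify([0.95, 0.15]))  # 1: looks like a complaint
```

Real tuning methods are more sophisticated, but the economics are the same: the expensive general-purpose representation is reused, and only a small, cheap piece is trained on your proprietary data.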

The field, and the regulatory guidance around it, is constantly evolving, so models that aren't regularly retrained with the latest best practices will quickly become stale. That's why the right AI and data platform is so important. You should look for a platform that has proven expertise in foundation models, has the governance tools in place to help you address potential ethical concerns, and can help you transition from experimentation to deployment. Then, as you get better and more confident over time in training and owning the models, you'll eventually be able to maintain and build them out on your own.

There's a lot of complexity to AI and foundation models, but working through all that complexity is truly worth it for where it's going to take us, both in terms of our business successes and our progress as a society. Think about those NASA scientists and engineers. Doing something new is never easy. But because they did the work, we've set foot on the moon and sent probes beyond our solar system. We can now explore our universe.
