You aren't thinking outside the box enough! Sure, you can create all the Kermit images you want. But the reason you're hearing about AI art is the ability to create images from ideas no one has ever expressed before. If you do a Google search for "a kangaroo made of cheese" you won't really find anything. But here's nine of them generated by a model.

*[Image: AI-generated art of "kangaroo made of cheese"]*

**You mentioned that it's all a load of maths before, but – putting it as simply as you can – how does it actually work?**

I'm no expert, but essentially what they've done is get a computer to "look" at millions or billions of pictures of cats and bridges and so on. These are usually scraped from the internet, along with the captions associated with them. The algorithms identify patterns in the images and captions and eventually can start predicting which captions and images go together. Once a model can predict what an image "should" look like based on a caption, the next step is reversing it – creating entirely novel images from new "captions".

> "DALL·E mini is an AI model that generates images from any prompt you give" – Flo, June 5, 2022

**When these programs are making new images, is it finding commonalities – like, all my images tagged "kangaroos" are usually big blocks of shapes like this, and "cheese" is usually a bunch of pixels that look like this – and just spinning up variations on that?**

It's kind of like a best guess based on what it's "seen" before. The algorithms don't "understand" what the words mean or the images in the same way you or I do. I can't stress enough that this isn't intelligence. If you look at this blog post from 2018 you can see how much trouble older models had. When given the caption "a herd of giraffes on a ship", one created a bunch of giraffe-coloured blobs standing in water. So the fact that we are now getting recognisable kangaroos and several kinds of cheese shows how big a leap there has been in the algorithms' "understanding".

**Dang. So what's changed so that the stuff it makes doesn't resemble completely horrible nightmares any more?**

There have been a number of developments in techniques, as well as in the datasets they train on. In 2020 a company named OpenAI released GPT-3 – an algorithm able to generate text eerily close to what a human could write. One of the most hyped text-to-image generating algorithms, DALL·E, is based on GPT-3; more recently, Google released Imagen, using its own text models. These algorithms are fed massive amounts of data and forced to do thousands of "exercises" to get better at prediction.

**"Exercises"? Are there still actual people involved, like telling the algorithms if what they're making is right or wrong?**

Actually, this is another big development. When you use one of these models you're probably only seeing a handful of the images that were actually generated. Similar to how these models were initially trained to predict the best captions for images, they only show you the images that best fit the text you gave them. They are marking themselves.

**But there's still weaknesses in this generation process, right?**

So there's quite a few limitations, both in what it can do and in what it does that it probably shouldn't do (such as potentially graphic imagery).
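The "predicting which captions and images go together" idea described above can be sketched very roughly in code. This is a toy illustration, not any real model's API: the embedding vectors below are made up by hand, standing in for what a trained image encoder and text encoder would produce, and matching is just cosine similarity between them.

```python
import math

def cosine(a, b):
    # Cosine similarity: closer to 1.0 means the two vectors "point the same way".
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Made-up embeddings standing in for a trained encoder's output (hypothetical files).
image_embeddings = {
    "photo_of_kangaroo.jpg": [0.9, 0.1, 0.0],
    "photo_of_cheese.jpg":   [0.1, 0.9, 0.0],
    "photo_of_bridge.jpg":   [0.0, 0.1, 0.9],
}
caption_embedding = [0.8, 0.2, 0.1]  # pretend this encodes "a kangaroo in a field"

# "Which image goes with this caption?" -- the highest-similarity image wins.
best = max(image_embeddings,
           key=lambda name: cosine(caption_embedding, image_embeddings[name]))
print(best)  # photo_of_kangaroo.jpg
```

In a real system the vectors come from neural networks trained on those billions of scraped image–caption pairs, so that matching pairs end up close together; the matching step itself is essentially this simple.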
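The "marking themselves" step – generating many candidate images and only showing the few that best fit your text – can also be sketched. Again this is a hedged toy, with an entirely fake "generator" (noisy copies of the prompt vector) in place of a real image model; only the generate-many-then-rerank shape is the point.

```python
import math
import random

def cosine(a, b):
    # Cosine similarity; the epsilon guards against a zero-length vector.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1e-12
    nb = math.sqrt(sum(y * y for y in b)) or 1e-12
    return dot / (na * nb)

def fake_generator(prompt_vec, n, rng):
    # Stand-in for the real image generator: each "image" is the prompt vector
    # plus noise, so some candidates fit the prompt better than others.
    return [[x + rng.gauss(0, 0.5) for x in prompt_vec] for _ in range(n)]

def generate_and_rerank(prompt_vec, n=32, keep=4, seed=0):
    rng = random.Random(seed)
    candidates = fake_generator(prompt_vec, n, rng)
    # The model "marks itself": every candidate is scored against the prompt,
    # and only the best-fitting handful is shown to the user.
    ranked = sorted(candidates, key=lambda img: cosine(prompt_vec, img), reverse=True)
    return ranked[:keep]

prompt = [0.9, 0.1, 0.2]
top = generate_and_rerank(prompt)
print(len(top))  # 4 of the 32 generated candidates survive reranking
```

Real systems use a separately trained image–text scoring model for the ranking step, which is why you usually only see a handful of the images that were actually generated.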