Word and sentence embeddings are the bread and butter of language models. Here is a very simple introduction to what they are.

TL;DR: Word embeddings and sentence embeddings are the bread and butter of LLMs; they are the basic building block of most language models. This article's title and TL;DR have been generated with Cohere.

In old futuristic movies, such as 2001: A Space Odyssey, the main computer, HAL, was able to talk to humans and understand what they said with great ease. At the time, getting computers to understand and produce language seemed like an impossible task, but the latest large language models (LLMs) are able to do this in a way that makes it almost impossible for a human to tell whether they are talking to another human or to a computer. The quintessential task of natural language processing (NLP) is to understand human language. However, there is a big disconnect here: humans speak in words and sentences, but computers only understand and process numbers. How can we turn words and sentences into numbers in a coherent way? An assignment of words to numbers is called a word embedding. We can think of a word embedding as an assignment of scores to the words, with some nice properties (that we'll learn soon).

What is a Word Embedding? Small Example and Quiz

Before we get into what a word embedding is, let me test your intuition. In Figure 1, I have located 12 words in the plane. Now, the question is: where would you locate the word "Apple" in this plane? There are many places it could go, but I'm allowing 3 possibilities, labeled A, B, and C. What I would do is locate it at point C, because it makes sense to have the word "Apple" close to the words "Banana", "Strawberry", and "Cherry", and far from other words such as "House", "Car", or "Tennis". And what are the numbers we assign to each word? Simply the horizontal and vertical coordinates of the word's location. In this way, the word "Apple" is assigned the coordinates of point C, and the word "Bicycle" the coordinates of its own location in the plane.

For the sake of redundancy, let's enumerate some properties that a nice word embedding should have:

1. Words that are similar should correspond to points that are close by (or equivalently, to scores that are similar).
2. Words that are different should correspond to points that are far away (or equivalently, to scores that are significantly different).

Word Embeddings Capture Features of the Word

The word embedding above satisfies properties 1 and 2. But there is something more to these word embeddings: they don't only capture word similarity, they also capture other properties of the language. In language, words can be combined to get more complicated concepts. In mathematics, numbers can be added or subtracted to get other numbers. Could we build a word embedding that captures relations between words as relations between numbers?

Let's look at four words: "Puppy", "Dog", "Calf", and "Cow". Now, to test your intuition again, I'm going to locate the words "Puppy", "Dog", and "Calf" in the plane, and I'll ask you to add the word "Cow". Where would you add it: in the spot labeled A, B, or C? While it would make sense to locate it at A, closer to "Calf", since they are both bovine, or at B, since it's an adult animal like "Dog", the place where I would put it is spot C. Why? Because the rectangle formed by the four words captures some very important relationships between them. For instance, two analogies are captured here. The analogy "A puppy is to a dog like a calf is to a cow" can be translated into "The path from the word puppy to the word dog is the same as the path from the word calf to the word cow". The analogy "A dog is to a cow like a puppy is to a calf" is also captured in this rectangle, as shown in the figure below.

However, this is not even the tip of the iceberg. The main property of word embeddings in effect here is that the two axes (vertical and horizontal) represent different things. If you look carefully, moving towards the right turns the puppy into a dog and the calf into a cow, which is an increase in age. Likewise, moving upwards turns a puppy into a calf and a dog into a cow, which is an increase in the size of the animal. It seems that this embedding understands that the words in it have two main properties, or features: age and size. Furthermore, it seems that the embedding locates age on the horizontal axis and size on the vertical axis.
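To make the idea of assigning coordinates to words concrete, here is a minimal sketch in Python. The words and coordinates are invented for illustration (they are not the exact values from Figure 1), but they show how properties 1 and 2 can be checked with a plain distance computation:

```python
import math

# Toy 2D word embedding: each word is assigned a pair of numbers,
# its horizontal and vertical coordinates in the plane.
# These values are made up for illustration.
embedding = {
    "apple":      (5.0, 5.0),
    "banana":     (5.5, 4.5),
    "strawberry": (4.5, 5.5),
    "house":      (1.0, 1.0),
    "car":        (0.5, 2.0),
}

def distance(word_a, word_b):
    """Euclidean distance between the points assigned to two words."""
    (x1, y1), (x2, y2) = embedding[word_a], embedding[word_b]
    return math.hypot(x2 - x1, y2 - y1)

# Property 1: similar words correspond to points that are close by.
print(distance("apple", "banana"))   # small distance
# Property 2: different words correspond to points that are far away.
print(distance("apple", "house"))    # large distance
```

Real embeddings work the same way; the only difference is that the points live in hundreds or thousands of dimensions rather than two.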
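The analogy property can also be phrased as arithmetic on the coordinates: the "path" from one word to another is simply the difference of their points. Here is a small sketch, again with made-up coordinates chosen so that age grows along the horizontal axis and size along the vertical axis, as in the figure:

```python
# Toy coordinates: age on the horizontal axis, size on the vertical axis.
# The exact numbers are invented for illustration.
words = {
    "puppy": (1.0, 1.0),   # young, small
    "dog":   (3.0, 1.0),   # adult, small
    "calf":  (1.0, 3.0),   # young, large
    "cow":   (3.0, 3.0),   # adult, large
}

px, py = words["puppy"]
dx, dy = words["dog"]
cx, cy = words["calf"]
wx, wy = words["cow"]

# "A puppy is to a dog like a calf is to a cow" means the two paths
# (difference vectors) are the same.
path_puppy_to_dog = (dx - px, dy - py)   # (2.0, 0.0): older, same size
path_calf_to_cow  = (wx - cx, wy - cy)   # (2.0, 0.0): the same path
assert path_puppy_to_dog == path_calf_to_cow

# Equivalently, the analogy can be solved with arithmetic:
# cow = calf + (dog - puppy)
predicted_cow = (cx + path_puppy_to_dog[0], cy + path_puppy_to_dog[1])
print(predicted_cow)  # (3.0, 3.0) -- lands exactly on "cow"
```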