In Chapter 4, Transforming Text into Data Structures, we discussed the bag-of-words and term frequency-inverse document frequency (TF-IDF) methods for representing text as numbers. These methods rely mostly on the surface occurrence of a word: whether, and how often, it appears in a document or across a text corpus. However, the approaches we have discussed so far do not take the neighborhood of a word into account, that is, which words come before or after it. The neighborhood of a word carries important information about the context in which the word is used in a sentence. The relationship between a word and its neighborhood tends to define the semantics of the word and its overall positioning and presence in a sentence. In this chapter, we will use this idea to build word vectors...
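To make the limitation concrete, the following is a minimal sketch in plain Python, not the implementation from Chapter 4. The helper names `bag_of_words` and `context_windows`, the whitespace tokenization, and the two example sentences are all assumptions chosen for illustration. It shows two sentences with opposite meanings that receive identical bag-of-words counts, while the neighborhoods of their words differ.

```python
from collections import Counter


def bag_of_words(sentence):
    """Count how often each lowercased token occurs, ignoring word order."""
    return Counter(sentence.lower().split())


def context_windows(sentence, window=1):
    """Return (token, left_neighbors, right_neighbors) for every token.

    `window` is the number of tokens taken on each side (an illustrative
    parameter, not something defined in the text).
    """
    tokens = sentence.lower().split()
    contexts = []
    for i, token in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        contexts.append((token, left, right))
    return contexts


s1 = "the dog chased the cat"
s2 = "the cat chased the dog"

# Same words, same counts: bag-of-words cannot tell the sentences apart.
print(bag_of_words(s1) == bag_of_words(s2))

# The neighborhoods do differ, which is the signal word vectors exploit.
for token, left, right in context_windows(s1):
    print("s1", token, left, right)
for token, left, right in context_windows(s2):
    print("s2", token, left, right)
```

In the first sentence, "dog" is followed by "chased"; in the second, it ends the sentence and has no right-hand neighbor. A count-based representation discards exactly this difference.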