Text2Landscape: Visualize a Text in Multiple Spaces with R — Force-directed networks, Biofabric, Word Embeddings, Principal Component Analysis and Self-Organizing Maps

You will find no realistic landscapes prior to the Renaissance. The saints of medieval murals float in a conceptual space informed by hierarchies and symbolic relations; so do those of the Prajñāpāramitā Sūtras. The word “landscape” appears with the Dutch painters of the 15th century. A landscape is a part of the world perceived by […]

Text Mining: Detect Strings: Word Lookup in a Large Corpus of Phrases Using a Large Dictionary with Julia

After achieving an optimized string detection algorithm in R for 1 million phrases using a 200k-word dictionary, I wondered whether I could get better results in Julia. My first attempt at this was a catastrophe. Within 24 hours, the Julia community helped me learn some basics of Julia code optimization and proposed a […]

Text Mining: Detect Strings: Very Fast Word Lookup in a Large Dictionary in R with data.table and matrixStats

Looking up words in dictionaries is the alpha and omega of text mining. I am, for instance, interested in knowing whether a given word from a large dictionary (>100k words) occurs in a sentence or not, for a list of over 1M sentences. The best approach to this task is to use the Julia language, but […]