Let us assume the existence of A and B, two points connected by a straight line which, if it extends beyond B, lets us imagine C, a point located anywhere on this imaginary line. C, then, can be said to be “in the continuity” of A and B. We’ll arrive at point C if we do not deviate. C will happen […] Continue Reading…
Text2Landscape: Visualize a Text in Multiple Spaces with R — Force-directed networks, Biofabric, Word Embeddings, Principal Component Analysis and Self-Organizing Maps
You will find no realistic landscapes prior to the Renaissance. The saints of medieval murals float in a conceptual space informed by hierarchies and symbolic relations; so do those of the Prajñāpāramitā Sūtras. The word “landscape” appears with the Dutch painters of the 15th century. A landscape is a part of the world perceived by a human being at […] Continue Reading…
Execute SPARQL chunks in R Markdown
Coding in R is useless without interesting research questions; and even the best questions remain unanswered without data. RStudio provides a number of convenient ways to access data, among them the possibility to write SQL code chunks in R Markdown, to run these chunks and to assign the query result directly to a variable of your choice. […] Continue Reading…
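As a rough illustration of the mechanism described above, here is a minimal sketch in plain R. It assumes the DBI and RSQLite packages are installed; the in-memory database, the `parties` table and its values are hypothetical and are not taken from the post. In an R Markdown document, knitr's `sql` engine would take the same connection through its `connection` chunk option and store the result in the variable named by `output.var`.

```r
library(DBI)

# Hypothetical in-memory database with a made-up example table
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "parties",
             data.frame(name = c("A", "B", "C"), seats = c(10L, 25L, 5L)))

# In R Markdown, the equivalent chunk would carry the header
#   {sql, connection=con, output.var="big_parties"}
# and contain only the SQL statement used below.
big_parties <- dbGetQuery(
  con,
  "SELECT name, seats FROM parties WHERE seats >= 10 ORDER BY seats DESC"
)

dbDisconnect(con)
```

Either way, `big_parties` ends up as an ordinary data frame that later R chunks can use.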
Text Mining: Detect Strings: Word Lookup in a Large Corpus of Phrases Using a Large Dictionary with Julia
After achieving an optimized string detection algorithm in R for 1 million phrases using a dictionary of 200k entries, I wondered if I could get better results in Julia. My first attempt at this was a catastrophe. Within 24 hours, the Julia community helped me to learn some basics of Julia code optimization and proposed a blazing fast translation of […] Continue Reading…
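To make the task concrete, here is a small base R sketch of a dictionary lookup over phrases. It is not the optimized algorithm from the post: it assumes whole-word matching on lowercase ASCII tokens, and the phrases and dictionary are invented for illustration.

```r
# Hypothetical toy inputs; the post works with about 1 million phrases
phrases <- c("the cat sat on the mat", "a dog barked", "nothing here")
dictionary <- c("cat", "dog", "mat")

# Split each phrase into lowercase word tokens
tokens <- strsplit(tolower(phrases), "[^a-z]+")

# Keep, for each phrase, the distinct tokens found in the dictionary;
# %in% relies on hashed matching, so the dictionary is not rescanned per token
hits <- lapply(tokens, function(words) unique(words[words %in% dictionary]))
names(hits) <- phrases
```

Matching substrings inside words, or multi-word dictionary entries, would need a different approach, for example regular expressions or a dedicated string-matching package.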
Ideograph – explore ideologies of political parties with SPARQL requests to Wikidata, D3 and PixiJS
Ideograph is a visual tool for exploring ideologies of political parties. It queries its data directly from the frequently updated WikiData graph database. You can filter the graph by country, and find further information by clicking on the node labels.
Ideograph is licensed under GNU GPL 3.0.
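For readers curious what such a request might look like, here is a hedged sketch in R that only builds a query and its URL, without sending it. It is not Ideograph's actual query. The identifiers are assumptions to verify on Wikidata before use: `P31` (instance of), `Q7278` (political party), `P17` (country), `P1142` (political ideology) and `Q142` (France).

```r
# Country to filter on; Q142 is assumed to be France
country_qid <- "Q142"

# Parties of that country together with their ideologies (illustrative query)
query <- sprintf('
SELECT ?party ?partyLabel ?ideology ?ideologyLabel WHERE {
  ?party wdt:P31 wd:Q7278 ;
         wdt:P17 wd:%s ;
         wdt:P1142 ?ideology .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}', country_qid)

# URL for the public Wikidata SPARQL endpoint, with the query percent-encoded
request_url <- paste0(
  "https://query.wikidata.org/sparql?format=json&query=",
  utils::URLencode(query, reserved = TRUE)
)
```

Fetching `request_url` with any HTTP client should return JSON bindings that can be turned into the nodes and links of a graph.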
Presentation Video
I presented Ideograph at the Wikidata for Civil Tech session. Watch the 5 […] Continue Reading…
Unify the extent of rasters in QGIS 3 to avoid clipping by raster calculator
This one drove me crazy today! If you try to sum values from rasters with different extents, the raster calculator clips the result to their overlapping zone. This might make sense in most cases, but in many other cases it absolutely does not. Users should have a choice.
An old workaround (2013) is proposed on a page called Raster extent […] Continue Reading…
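The difference between the clipped result and the desired one can be sketched with plain bounding boxes in base R. This is only an illustration of the geometry, not the QGIS workaround itself; the two extents are hypothetical.

```r
# Extents as named vectors; the values are invented for illustration
ext_a <- c(xmin = 0, xmax = 10, ymin = 0, ymax = 10)
ext_b <- c(xmin = 5, xmax = 15, ymin = 5, ymax = 15)

# Overlapping zone: what a calculator that clips would keep
overlap_extent <- c(
  xmin = max(ext_a[["xmin"]], ext_b[["xmin"]]),
  xmax = min(ext_a[["xmax"]], ext_b[["xmax"]]),
  ymin = max(ext_a[["ymin"]], ext_b[["ymin"]]),
  ymax = min(ext_a[["ymax"]], ext_b[["ymax"]])
)

# Unified extent: the smallest box that contains both rasters
union_extent <- c(
  xmin = min(ext_a[["xmin"]], ext_b[["xmin"]]),
  xmax = max(ext_a[["xmax"]], ext_b[["xmax"]]),
  ymin = min(ext_a[["ymin"]], ext_b[["ymin"]]),
  ymax = max(ext_a[["ymax"]], ext_b[["ymax"]])
)
```

Extending every raster to `union_extent` before summing, with the added cells set to a neutral value such as 0, is the general idea behind unifying extents.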
Visualizing geographic networks
Networks are the “other” geographic space, often neglected in classical approaches that focus too narrowly on the topographic space of Euclidean extents, the projected surfaces of the globe. In this exercise, we will examine how networks can be formalized as computer data and how such data can be analyzed visually.
The mathematical notion of a graph, covered in class, is […] Continue Reading…
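One common way to formalize a network as data is a node table plus an edge list, from which an adjacency matrix can be derived. The base R sketch below uses four hypothetical places with made-up coordinates; it is an illustration, not the exercise's own dataset.

```r
# Hypothetical nodes with invented coordinates
nodes <- data.frame(
  id = c("A", "B", "C", "D"),
  x  = c(0, 1, 1, 2),
  y  = c(0, 1, 0, 0)
)

# Undirected links, one row per connection
edges <- data.frame(
  from = c("A", "A", "B", "C"),
  to   = c("B", "C", "C", "D")
)

# Adjacency matrix: 1 where two nodes are linked, 0 elsewhere
adj <- matrix(0L, nrow = nrow(nodes), ncol = nrow(nodes),
              dimnames = list(nodes$id, nodes$id))
adj[cbind(edges$from, edges$to)] <- 1L
adj[cbind(edges$to, edges$from)] <- 1L  # mirror the links: the graph is undirected

# Degree: the number of links attached to each node
degree <- rowSums(adj)
```

The same two tables are what most network visualization tools expect as input.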
Cleaning up PDFs of pre-1990s scanned texts for text mining in R with Quanteda
Text sources are often PDFs. If optical character recognition (OCR) has been applied, the pdftools R package allows you to extract text from all PDFs to text files stored in a folder. The readtext package converts the set of text files into something useful for Quanteda. Nevertheless, some cleaning is necessary before transforming your text into a useful corpus. […] Continue Reading…
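The kind of cleaning meant here might look like the following base R sketch. It is an assumption about typical OCR debris (words hyphenated across line breaks, stray page numbers, irregular whitespace), not the post's actual cleaning code, and `clean_ocr_text` is a hypothetical helper name.

```r
clean_ocr_text <- function(x) {
  # Rejoin words hyphenated across line breaks; note that this also merges
  # genuine compounds that happen to break at a line end
  x <- gsub("-[ \t]*\n[ \t]*", "", x)
  # Drop lines that contain nothing but a number (likely page numbers)
  x <- gsub("(?m)^[ \t]*\\d+[ \t]*$", "", x, perl = TRUE)
  # Collapse runs of whitespace, including line breaks, into single spaces
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

raw_page <- "The implemen-\ntation of\n\n12\n\nthe plan"
cleaned <- clean_ocr_text(raw_page)
# cleaned is "The implementation of the plan"
```

A function like this can be applied to the text column of the object returned by readtext before building the corpus.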
Visualizing data with Orange Data Mining
This exercise introduces you to data visualization with Orange Data Mining, a software package developed by the University of Ljubljana. This “visual programming” tool lets you build an analysis workflow without writing code. Orange Data Mining is thus a good way to become familiar with data-processing concepts. Do not hesitate to […] Continue Reading…
Stacked histogram with bivariate colored bars in R
A histogram gives you counts of elements within specific ranges of a variable, represented as bars. Sometimes, you want to see more than bars. The following code allows you to represent a second variable with a color shade:
library(ggplot2)
library(data.table)
# create an example of a table
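d <- data.table(
  # slope: 50 standard-normal draws (rounding to 50 digits leaves them effectively unchanged)
  slope = round(rnorm(50), 50),
  # p: 50 distinct values between 0.001 and 0.05
  p = sample(1:50, 50) / 1000
)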
# discretize continuous values
d$midp <- floor(d$p*100)/100
d$midslope <- floor(d$slope*10)/10
d$midp […] Continue Reading…
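Since the listing above is cut off, here is an independent minimal sketch of the same idea, not a continuation of the author's code. It assumes ggplot2 is installed, uses a plain data frame, bins the second variable with `cut()`, and lets ggplot2 stack the bars; the bin widths and the palette are arbitrary choices.

```r
library(ggplot2)

set.seed(1)  # reproducible example data
d <- data.frame(
  slope = rnorm(50),
  p     = sample(1:50, 50) / 1000
)

# Discretize p into five classes of width 0.01
d$p_bin <- cut(d$p, breaks = seq(0, 0.05, by = 0.01), include.lowest = TRUE)

# Histogram of slope; each bar is stacked by p class and shaded accordingly
hist_plot <- ggplot(d, aes(x = slope, fill = p_bin)) +
  geom_histogram(binwidth = 0.5, position = "stack") +
  scale_fill_brewer(palette = "Blues", direction = -1, name = "p") +
  labs(x = "slope", y = "count")
```

Printing `hist_plot` draws the chart; with `direction = -1`, darker shades correspond to smaller values of p.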