First steps with the R package “Quanteda” for linguistic analysis

This exercise aims to familiarize you with the Quanteda package for linguistic analysis. It assumes that you have already taken your first steps with R and RStudio. Install and load the packages. Install the packages quanteda, quanteda.textstats, quanteda.textplots, readtext, ggplot2 and udpipe: Create a new R script to save your work as you go. […]

Reorder geom_bar or anything else in ggplot by the value of your choosing

You can still find recent answers on Stack Overflow advising you to redefine the factors of a data.frame in order to reorder the elements of a ggplot graphic. In the 2020s, this can be avoided. Factors are a legacy of R's numerical focus, from a time when text values were seen as an anomaly or, at best, as ordinal values. I highly […]

Scrape an image from DeepZoom with R and magick, recomposing a single image from multiple tiles

DeepZoom allows webmasters to display high-resolution images in an online viewer. Among its users: The British Library; The World Digital Library (WDL); Polona, the Polish Digital National Library; BALaT, Belgian Art Links and Tools; and many others. DeepZoom mostly discourages downloading the original high-resolution images to your local drive. This is how to […]

Radar charts with R

Radar charts, also called spider charts, serve to compare the profiles of individuals. They are most useful when every profile is compared to an average profile. They are most pertinent when the order of the axes has an inherent meaning, such as cardinal directions, the surroundings of an individual (the level of noise from left, right, […]

Poliphilo Workshop 2: part two

This is the second edition of the workshop created by the author for the architecture laboratory ALICE (EPFL-ENAC) as part of the POLIPHILO research and teaching project. The first workshop (2019) remains available online. If you did not attend the session of April 1st, see the first part of the second workshop. In this second […]

COVID-19 Coronavirus cases in absolute numbers and per capita – evolution

Graphics of the evolution of the COVID-19 pandemic. Updated regularly since March 31st. Source code for graphic production included. Excellent graphics of the evolution of COVID-19 have been made by John Burn-Murdoch for the Financial Times and by Lisa Charlotte Rost for the Grand Continent (observatoire Coronavirus : tendances globales). Not only do they use […]

Draw anything you want with R on the Cartesian grid

Most of the graphics documentation in R is dedicated to high-level plotting packages. Sometimes, these packages remind me of LaTeX: you spend less time producing the graphical elements you need than getting rid of elements you have not asked for. Some of these packages are great, of course, and you should definitely use them for […]

Repair a pandoc-generated LaTeX table with R

Pandoc is a great piece of software, but it is not always kind to HTML tables when converting to LaTeX. In particular, tables containing <tr> elements with rowspan or <td> elements with colspan attributes end up as sequences of lines of text, not embedded in a table environment like longtable and devoid of both line endings (\\) […]

Text Mining: Very Fast Word Lookup in a Large Dictionary in R with data.table and matrixStats

Looking up words in dictionaries is the alpha and omega of text mining. I am, for instance, interested in knowing whether a given word from a large dictionary (>100k words) occurs in a phrase or not, for a list of over 1M phrases. R can be horribly slow or quite fast at this task, depending […]

First steps with R and RStudio

This exercise requires that you have already installed R and RStudio. Getting familiar with the interface. Open RStudio. You should see the interface as in the image below, for now without part A. Part C is normally empty: The functions of these different parts are as follows: A: Editing window for source files. Here you […]