Creating Hashes in R with the Hash Package
R does not provide a dedicated hash table type, which is unfortunate: when you need fast retrieval of information by key and do not care about element order, a hash table is a good choice of data structure. R users are not without options, though. The first option is to use an environment.
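As a rough illustration of the environment approach, here is a minimal base R sketch. The variable name ages and the sample names are assumptions made up for this example.

```r
# An environment used as a simple key-value store (base R only)
ages <- new.env(hash = TRUE)

assign("Alice", 44, envir = ages)  # insert a key-value pair
ages[["Bob"]] <- 40                # equivalent shorthand

get("Alice", envir = ages)                        # look up a value by key
exists("Carol", envir = ages, inherits = FALSE)   # check whether a key is present
ls(ages)                                          # list the stored keys
```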
The downside is that one cannot easily work with whole vectors of keys or values in an environment. A viable alternative can be found in an R package named hash, which offers an easy way to work with hashes without handling environments directly.
Using the Hash Package
As always, before you can use hash, it has to be installed. Once it is installed, load it in your R script with library("hash").
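One common pattern, shown here as a sketch rather than a required setup, installs the package from CRAN only when it is missing and then loads it:

```r
# Install hash from CRAN only if it is not already available, then load it
if (!requireNamespace("hash", quietly = TRUE)) {
  install.packages("hash")
}
library("hash")
```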
To give a concrete example of how to use the hash package, imagine a vector of 10 names:
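A minimal sketch of such a vector follows. The ten names and the variable name people are placeholders chosen for illustration.

```r
# Hypothetical names, used only for illustration
people <- c("Alice", "Bob", "Carol", "Dave", "Eve",
            "Frank", "Grace", "Heidi", "Ivan", "Judy")

length(people)  # number of names in the vector
```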
Then, create a second vector containing ages. The example below shows how to randomly generate 10 numbers between 18 and 70.
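One way to do this is with base R's sample(). The variable name ages and the seed are assumptions for this sketch; a different seed (or none at all) will produce different numbers.

```r
set.seed(42)  # arbitrary seed, only so that the draw is repeatable

# Draw 10 integers from 18 to 70; replace = TRUE allows the same age to appear twice
ages <- sample(18:70, size = 10, replace = TRUE)
ages
```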
The random numbers generated for our example are 44, 40, 67, 35, 41, 53, 55, 56, 52, and 58. To map the keys (names) to the age values, use the function hash().
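Put together, the mapping might look like the sketch below. The names are placeholders, and the ages are typed in directly so that they match the values listed above.

```r
library(hash)

# Hypothetical names, used only for illustration
people <- c("Alice", "Bob", "Carol", "Dave", "Eve",
            "Frank", "Grace", "Heidi", "Ivan", "Judy")
# Ages quoted in the text above
ages <- c(44, 40, 67, 35, 41, 53, 55, 56, 52, 58)

# Map each name (key) to the age in the same position (value)
h <- hash(keys = people, values = ages)

print(h)
```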
Printing the resulting hash should list each key next to its value.
Useful Functions
Aside from hash(), which creates the hash itself, the package provides the following functions:
- keys(), to retrieve all keys in a hash.
- values(), to retrieve all values in a hash or a single value. You can also use double square brackets or the dollar sign to access a single value by its key.
- .set(), to add a new key-value pair to the hash.
- has.key(), to verify the existence of a key in a hash.
- invert(), to swap keys and values. A note of caution: values may repeat, so using them as keys could lead to problems in data retrieval.
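The functions above can be tried out on a small hash. This is a sketch with made-up names and ages; the calls follow the usage shown in the package documentation.

```r
library(hash)

# Small hash with placeholder data
h <- hash(keys = c("Alice", "Bob", "Carol"), values = c(44, 40, 67))

keys(h)                    # all keys
values(h)                  # all values, named by key
values(h, keys = "Alice")  # value for a single key
h[["Bob"]]                 # single value via double square brackets
h$Carol                    # single value via the dollar sign

.set(h, keys = "Dave", values = 35)  # add a new key-value pair
has.key("Dave", h)                   # check for a key that exists
has.key("Zed", h)                    # check for a key that does not exist

inv <- invert(h)  # ages become keys, names become values
keys(inv)
```

Because every age in this small hash is unique, inverting it is unambiguous; with repeated values it is worth inspecting the inverted hash before relying on it.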
To learn more about the hash package, view its documentation at the CRAN repository.