YARN / MapReduce

Here are a few examples of running the MapReduce jobs available on the platform.

Computing pi

[jthomaze@co2-hdp26-client ~]$ hadoop jar /usr/hdp/2.6.3.0-235/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 1000

Number of Maps = 10
Samples per Map = 1000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9

Starting Job
...
Job Finished in 37.872 seconds
Estimated value of Pi is 3.14080000000000000000
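
The two arguments (10 and 1000) are the number of map tasks and the number of samples per map, so the job evaluates 10 x 1000 = 10000 points in total. Assuming the estimate is computed as 4 x (points inside the circle) / (total points), the value 3.1408 corresponds to 7852 of the 10000 points falling inside. The sketch below illustrates that kind of estimation in a single process. It assumes a quasi-Monte Carlo scheme based on a Halton sequence (bases 2 and 3) over the unit square; the function names halton and estimate_pi are hypothetical, and this is not the code shipped in hadoop-mapreduce-examples.jar.

```python
def halton(index, base):
    """Return element number `index` (1-based) of the Halton sequence for `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def estimate_pi(num_maps, samples_per_map):
    """Estimate pi from num_maps * samples_per_map quasi-random points.

    Assumption: each point lies in the unit square and is counted when it
    falls inside the circle of radius 0.5 centred on (0.5, 0.5).
    """
    total = num_maps * samples_per_map
    inside = 0
    for i in range(1, total + 1):
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    # Area of the circle / area of the square = pi / 4
    return 4.0 * inside / total


if __name__ == "__main__":
    print(estimate_pi(10, 1000))
```

The printed value should be close to the result reported by the job, though not necessarily identical, since the way the real job distributes points across its map tasks may differ from this sketch.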

Word count

[jthomaze@co2-hdp26-client ~]$ hadoop fs -mkdir in

In this example, the in directory holds the data to process, here the contents of the /etc/hosts file.

[jthomaze@co2-hdp-client ~]$ hadoop fs -put -f /etc/hosts in
[jthomaze@co2-hdp-client ~]$ hadoop fs -ls in

Found 1 items
-rw-r--r-- 1 jthomaze systeme 1054 2016-01-18 14:39 in/hosts

[jthomaze@co2-hdp-client ~]$ hadoop jar /usr/hdp/2.6.3.0-235/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount in out

16/01/18 14:44:01 INFO impl.TimelineClientImpl: Timeline service address: http://co2-hdp26-master1.irit.fr:8188/ws/v1/timeline/
16/01/18 14:44:01 INFO client.RMProxy: Connecting to ResourceManager at co2-hdp26-master1.irit.fr/141.115.102.101:8050
16/01/18 14:44:03 INFO input.FileInputFormat: Total input paths to process : 1
...
File Input Format Counters
Bytes Read=1054
File Output Format Counters
Bytes Written=1122

[jthomaze@co2-hdp26-client ~]$ hadoop fs -ls out

Found 2 items
-rw-r--r-- 1 jthomaze systeme 0 2016-01-18 14:44 out/_SUCCESS
-rw-r--r-- 1 jthomaze systeme 1122 2016-01-18 14:44 out/part-r-00000

[jthomaze@co2-hdp26-client ~]$ hadoop fs -cat out/part-r-00000

127.0.0.1 1
141.115.102.100 1
141.115.102.101 1
141.115.102.102 1
...
localhost.localdomain 2
localhost4 1
localhost4.localdomain4 1
localhost6 1
localhost6.localdomain6 1
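
Each output line is a token followed by the number of times it occurs in the input, with tokens listed in sorted order. The sketch below reproduces that logic locally, assuming tokens are split on whitespace and keys are sorted by character code. The function name word_count and the two sample lines are hypothetical (they are not the actual contents of the /etc/hosts file used above), and the sketch is not the Hadoop implementation.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated tokens and return (token, count) pairs sorted by token."""
    counts = Counter()
    for line in lines:
        # "Map" step: split the line into tokens.
        # "Reduce" step: Counter sums the occurrences of each token.
        counts.update(line.split())
    return sorted(counts.items())


if __name__ == "__main__":
    # Hypothetical sample input, in the style of a hosts file.
    sample = [
        "127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4",
        "::1 localhost localhost.localdomain localhost6 localhost6.localdomain6",
    ]
    for token, count in word_count(sample):
        print(token, count)
```

With this sample, localhost.localdomain appears on both lines and is therefore counted twice, while each address and each suffixed alias is counted once.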