Hadoop CLI on Windows
Johnny Foulds edited this page Jul 12, 2020 · 3 revisions
This page shows how to deploy the Hadoop command-line tools to a Windows development machine that will be used to interact with the Hadoop cluster.
Install the following prerequisites first:
- Notepad++ - https://notepad-plus-plus.org/download/v7.7.1.html
- Java 8 - https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
- Windows Subsystem for Linux (WSL)
Set the following environment variables:

Variable | Value |
---|---|
JAVA_HOME | C:\PROGRA~1\Java\jdk1.8.0_211 |
HADOOP_HOME | c:\data-analytics\hadoop |
Add %JAVA_HOME%\bin, %HADOOP_HOME%\bin, and %HADOOP_HOME%\sbin to the Path environment variable.
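Download the Hadoop release from PowerShell. Invoke-WebRequest (for which wget is an alias in Windows PowerShell) only saves the download to disk when -OutFile is given:
PS C:\> mkdir c:\data-analytics
PS C:\> cd c:\data-analytics\
PS C:\data-analytics> Invoke-WebRequest -Uri http://archive.apache.org/dist/hadoop/core/hadoop-3.1.2/hadoop-3.1.2.tar.gz -OutFile hadoop-3.1.2.tar.gz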
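Then, from a WSL shell, extract the archive and add the Windows native binaries (winutils.exe and hadoop.dll):
$ cd /mnt/c/data-analytics/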
$ tar -xvzf hadoop-3.1.2.tar.gz
$ echo "hadoop-3.1.2" > hadoop-3.1.2/_version.txt
$ mv hadoop-3.1.2 hadoop
$ wget https://github.com/s911415/apache-hadoop-3.1.0-winutils/raw/master/bin/winutils.exe
$ wget https://github.com/s911415/apache-hadoop-3.1.0-winutils/raw/master/bin/hadoop.dll
$ mv winutils.exe hadoop/bin/
$ mv hadoop.dll hadoop/bin/
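Before moving on, it can help to confirm that the steps above left every expected file in place. The following is a minimal sketch, not part of Hadoop: `check_hadoop_layout` is a hypothetical helper, and the list of files it checks is an assumption based on the commands above plus the launcher script and configuration file shipped in the release archive.

```shell
# Hypothetical helper (not part of Hadoop): report files that the
# installation steps should have produced but that are missing.
check_hadoop_layout() {
    hadoop_dir="$1"
    missing=0
    # Assumed file list: the two Windows binaries, the hdfs launcher,
    # the site configuration, and the version marker written earlier.
    for f in bin/winutils.exe bin/hadoop.dll bin/hdfs.cmd \
             etc/hadoop/core-site.xml _version.txt; do
        if [ ! -f "$hadoop_dir/$f" ]; then
            echo "missing: $hadoop_dir/$f"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

From WSL it could be called as `check_hadoop_layout /mnt/c/data-analytics/hadoop`, which prints nothing when all files are found.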
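Open a new PowerShell window and check that the hdfs command is found on the Path and can start Java:
PS C:\> hdfs -version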
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
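List the root directory of the cluster's file system, giving the full NameNode URI:
PS C:\> hadoop fs -ls hdfs://pshp111:9000/
To avoid typing the server address with every command, edit hadoop/etc/hadoop/core-site.xml and add the following property inside the <configuration> element to set the default file system: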
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://pshp111:9000</value>
</property>
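The property only takes effect when it sits inside the `<configuration>` root element of core-site.xml. As a rough sketch of what the complete file could look like, the hypothetical helper below (`write_core_site` is not part of Hadoop) writes a minimal core-site.xml from WSL. It overwrites the target, so it assumes the file holds no other settings worth keeping.

```shell
# Hypothetical helper (not part of Hadoop): write a minimal core-site.xml
# that sets fs.defaultFS. WARNING: overwrites the target file.
write_core_site() {
    target="$1"      # path of the core-site.xml to write
    default_fs="$2"  # e.g. hdfs://pshp111:9000
    cat > "$target" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$default_fs</value>
  </property>
</configuration>
EOF
}
```

For the setup on this page the call would be `write_core_site /mnt/c/data-analytics/hadoop/etc/hadoop/core-site.xml hdfs://pshp111:9000`. Afterwards `hadoop fs -ls /` should resolve against that server without the full URI.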
Upload a sample file to test if it is working:
PS C:\> hdfs dfs -put C:\Temp\IISLogs\W3SVC1291934293\u_ex190620.log /
PS C:\> hdfs dfs -ls /
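The upload can also be checked without reading the listing by eye: `hdfs dfs -test -e` sets its exit status according to whether a path exists. The sketch below wraps that call in a hypothetical helper (`hdfs_has_file` is not part of Hadoop) and assumes it runs in a shell where the `hdfs` command resolves.

```shell
# Hypothetical helper (not part of Hadoop): report whether a path exists
# in HDFS, using the exit status of `hdfs dfs -test -e`.
hdfs_has_file() {
    if hdfs dfs -test -e "$1"; then
        echo "found: $1"
    else
        echo "not found: $1"
        return 1
    fi
}
```

For the sample file above, the call would be `hdfs_has_file /u_ex190620.log`.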
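The uploaded file can then be read from Spark, for example in a Zeppelin notebook paragraph:
// Read the uploaded log from HDFS as an RDD with one element per line
val logFile = sc.textFile("hdfs://pshp111:9000/u_ex190620.log")
// Convert the RDD to a DataFrame and render it with Zeppelin's display helper
z.show(logFile.toDF)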