2. Introduction and Getting Set Up

GETTING SET UP (Install Java, Spark, Scala, Eclipse)

Install a Java Development Kit (JDK)

Install Spark (pre-built)

  • Choose a Spark release (version 2.0.0 or later)

  • Choose a package type: Pre-built for Hadoop 2.7 and later

  • Download the Spark .tgz file (a Unix compression format)

  • On Windows you may need a third-party program to extract the .tgz file

  • Open the Spark .tgz file, extract it, and copy all the files inside into a new folder called spark on the C: drive (C:\spark)

  • Change the Spark logging configuration: go to C:\spark\conf and rename log4j.properties.template to log4j.properties

  • Open log4j.properties with whatever text editor you have (I use Sublime Text 3)

  • Change the following setting so that only errors are logged to the console (the default level is INFO)

    # Set everything to be logged to the console
    log4j.rootCategory=ERROR, console
  • Install winutils.exe and create the folder that HADOOP_HOME will point to

  • Create a new folder called winutils on the C: drive, and inside it create another folder called bin

  • Copy winutils.exe into C:/winutils/bin
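
  • After these steps, the folder layout should look roughly like this (contents abbreviated, and the exact files will depend on the Spark release you downloaded):

    C:\spark\
        bin\          (spark-shell, spark-submit, ...)
        conf\log4j.properties
        jars\
        ...
    C:\winutils\
        bin\winutils.exe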

Set Up SPARK_HOME, JAVA_HOME, And PATH Environment Variables

  • To set up the Windows environment variables, right-click on the Windows icon in the bottom left-hand corner and go into Control Panel

  • Click on System and Security, then on System, and then Advanced system settings. Click on Environment Variables

  • Click on New under the User variables for your user account

  • Input the name and value for each new user variable (see the example values after this list)

  • Now click Edit on Path under the user variables, and click New to add the entries shown in the example after this list

  • Press OK for all of the settings
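
  • As a guide, the new user variables should end up looking something like this (the JAVA_HOME value is only an example, point it at wherever your JDK is actually installed):

    SPARK_HOME    C:\spark
    JAVA_HOME     (your JDK install folder, e.g. somewhere under C:\Program Files\Java)
    HADOOP_HOME   C:\winutils

    Path entries to add:
    %SPARK_HOME%\bin
    %JAVA_HOME%\bin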

Install Scala IDE (bundled with Eclipse)

  • Extract the archive and copy the eclipse folder to a new folder called eclipse on the C: drive (C:\eclipse)

  • Create a desktop shortcut to C:\eclipse\eclipse.exe

  • Open up a Windows Command Prompt as Administrator and launch the Spark shell to test the install (see the example below)

  • To exit the Spark shell, just hit Ctrl-D
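
  • As a quick sanity check (assuming Spark was extracted to C:\spark and %SPARK_HOME%\bin was added to the Path as above), launching the Spark shell and running a small Scala expression should look roughly like this:

    cd C:\spark
    spark-shell

    scala> sc.parallelize(1 to 100).count()
    res0: Long = 100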

Detailed, Written Steps At SunDog Website
