HDInsight is a platform for provisioning Hadoop, Spark, Storm, HBase, and Kafka clusters, as well as R Server, on Windows Azure.
Now, you will learn how to set up a Hadoop cluster on Windows Azure. First, you need blob storage to provision the cluster. The blob container you specify acts as the cluster's file system, much like HDFS.
How to Create Blob Storage
Log on to the Azure portal.
On the left pane of the portal, select New > Storage > Storage account.
In the Create Storage dialog, provide the storage account name and select the kind of account and its performance.
Select the Replication type that suits your needs. You will see three types of replication: locally redundant storage (LRS), geo-redundant storage (GRS), and read-access geo-redundant storage (RA-GRS).
Select the Subscription and Resource group where you want to create your blob storage. You can create a new resource group or use an existing one.
Select the Location where you want to create your blob storage and then click Create.
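The choices made in the steps above can also be written down and checked before you open the portal. The following is a minimal sketch, not part of any Azure SDK: the helper storage_account_spec and the example values are hypothetical. It assumes a standard-performance account, for which the three replication types correspond to the SKU names Standard_LRS, Standard_GRS, and Standard_RAGRS, and it applies the storage account naming rule (3 to 24 characters, lowercase letters and digits only).

```python
import re

# Replication types from the portal mapped to the Azure storage SKU names
# for a standard-performance account.
REPLICATION_SKUS = {
    "LRS": "Standard_LRS",
    "GRS": "Standard_GRS",
    "RA-GRS": "Standard_RAGRS",
}

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def storage_account_spec(name, replication, location, resource_group):
    """Validate the inputs and return them as a plain dict (hypothetical helper)."""
    if not _ACCOUNT_NAME.fullmatch(name):
        raise ValueError("account name must be 3-24 lowercase letters or digits")
    if replication not in REPLICATION_SKUS:
        raise ValueError(f"unknown replication type: {replication}")
    return {
        "name": name,
        "sku": REPLICATION_SKUS[replication],
        "location": location,
        "resource_group": resource_group,
    }


if __name__ == "__main__":
    # Example values only; substitute your own names and region.
    print(storage_account_spec("hadoopstore01", "LRS", "westus", "demo-rg"))
```

Checking the name up front is useful because the portal rejects names with uppercase letters, hyphens, or underscores.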
1. Create Hadoop Cluster
On the left pane of the portal, select New > Intelligence + analytics > HDInsight.
2. Configure Basic Settings
In the Basic dialog, provide a Cluster name and select the Subscription.
Click Cluster configuration and set Cluster type to Hadoop from the drop-down. Then, set the Operating system to Windows and select a Version.
Provide your username and password. Also provide the remote desktop username, which you will need to connect to the Hadoop cluster over RDP.
Select an existing Resource group, or create a new one, to host your Hadoop cluster.
Select the Location where you want to create your Hadoop cluster and click Next.
3. Set Storage Settings
In the Storage dialog, you can create new storage within your subscription or select existing storage.
You will see a Default container name already filled in, but you can change it to suit your requirements. Then, click Next.
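Because the default container serves as the cluster's file system, files in it are addressed on the cluster with a wasb:// URI that combines the container name and the storage account name. The sketch below shows how such a URI is put together. The helper wasb_uri and the example names are hypothetical placeholders, not values from this walkthrough.

```python
def wasb_uri(container, account, path=""):
    """Build the wasb:// URI for a path inside a blob container.

    container and account are the Default container name and the storage
    account name chosen in the Storage dialog (placeholder values here).
    """
    return (
        f"wasb://{container}@{account}.blob.core.windows.net/"
        f"{path.lstrip('/')}"
    )


if __name__ == "__main__":
    # Example values only; substitute your own container and account.
    print(wasb_uri("mycluster", "hadoopstore01", "/example/data"))
```

A URI built this way can be passed to Hadoop file system commands on the cluster, for example hadoop fs -ls, in the same way as an hdfs:// path.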
4. Confirm Configurations
In the Confirmation dialog, you can customize the cluster size, applications, and settings.
Click Create. It will take up to 20 minutes to provision your Hadoop cluster on Windows Azure HDInsight.
Once the Hadoop cluster is deployed, you can click your Hadoop cluster instance name. It will show you various features, such as remote desktop, diagnose, and scale cluster.
Similarly, you can provision Spark, Storm, Kafka, and HBase with Windows Azure HDInsight.
Now, you know how to create blob storage and HDInsight Hadoop clusters with Windows Azure!