MapR Persistent Application Client Containers (PACCs) support containerization of existing and new applications by providing containers with persistent data access from anywhere. PACCs are purpose-built for connecting to MapR services. They offer secure authentication and connection at the container level and extensible support for the application layer, and they can be customized and published in Docker Hub.
Microsoft SQL Server 2017 for Linux offers the flexibility of running MSSQL in a Linux environment. Like any RDBMS, it needs a robust storage platform to persist its databases, where the data is managed and protected securely.
By containerizing MSSQL with MapR PACCs, customers get the benefits of MSSQL, MapR, and Docker combined. Here, MSSQL offers robust RDBMS services that persist data into MapR for disaster recovery and data protection while leveraging Docker technologies for scalability and agility.
The diagram below shows the architecture for our demonstration:
A MapR Cluster
Before you can deploy the container, you need a MapR cluster to persist data to. There are multiple ways to deploy a MapR cluster. You can use a sandbox, or you can use the MapR installer for on-premises or cloud deployment. The easiest way to deploy MapR on Azure is through the MapR Azure Marketplace. Sign up for Azure, purchase a subscription with sufficient quotas (such as CPU cores and storage), and fill out a form with some basic questions about the infrastructure and MapR; then off you go at the click of a button. A fully deployed MapR cluster should be at your fingertips within 20 minutes.
A VM with Docker CE/EE Running
Second, you need to spin up a VM in the same VNet or subnet where your MapR cluster is located. Docker CE/EE is required. For information on how to install Docker, follow this link: https://docs.docker.com/engine/installation/. Docker supports a wide variety of OS platforms. We used CentOS for our demo.
Once you have the MapR cluster and VM running, you can kick off your container deployment.
Step 1 – Build a Docker Image
Log in to your VM as root and run the following command:
curl -L https://raw.githubusercontent.com/jsunmapr/pacc-mssql/master/build | bash
In a few minutes, you should see a message similar to the one below, indicating a successful build:
Execute the following command to verify that the image (mapr-azure/pacc-mssql:latest) is indeed stored in the local Docker repository:
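One way to run this check, sketched here under the assumption that the Docker CLI is on the PATH, is to ask `docker images` for the image by name. The helper name `image_present` is hypothetical; the image name is the one produced by the build step above.

```shell
# Image name produced by the build script in Step 1.
IMAGE="mapr-azure/pacc-mssql:latest"

# Hypothetical helper: succeeds only if the image exists locally.
image_present() {
    # `docker images -q NAME:TAG` prints just the image ID;
    # empty output means no such image is stored locally.
    [ -n "$(docker images -q "$1")" ]
}

# Usage:
#   image_present "$IMAGE" && echo "found $IMAGE"
```

Running `docker images mapr-azure/pacc-mssql` directly shows the same information in table form.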
Step 2 – Create a Volume for MSSQL
Before starting up the container, you need to create a volume on the MapR cluster to persist the database into. Log in to the MapR cluster as user 'mapr' and run the following command to create a volume, e.g., vol1, mounted on the path /vol1 in the filesystem:
maprcli volume create -path /vol1 -name vol1
You can get the cluster name by executing this command:
maprcli dashboard info -json | grep name
Step 3 – Start Up the Container
Run the following command to spin up the container with the image we just built in Step 1 above:
# docker run --rm --name pacc-mssql -it \
    --cap-add SYS_ADMIN \
    --cap-add SYS_RESOURCE \
    --device /dev/fuse \
    --security-opt apparmor:unconfined \
    --memory 0 \
    --network=bridge \
    -e ACCEPT_EULA=Y \
    -e SA_PASSWORD=m@prr0cks \
    -e MAPR_CLUSTER=mapr522 \
    -e MSSQL_BASE_DIR=/mapr/mapr522/vol1 \
    -e MAPR_CLDB_HOSTS=172.31.35.153 \
    -e MAPR_MOUNT_PATH=/mapr \
    -e MAPR_TZ=Etc/UTC \
    -e MAPR_CONTAINER_USER=root \
    -e MAPR_CONTAINER_UID=0 \
    -e MAPR_CONTAINER_GROUP=root \
    -e MAPR_CONTAINER_GID=0 \
    -p 1433:1433 \
    mapr-azure/pacc-mssql:latest
Note that you can replace -it with -d in the first line to run the startup process in the background.
You can customize the environment variables (the -e options) above to fit your environment. SA_PASSWORD is the password for the MSSQL admin user. MAPR_CLUSTER is the cluster name. MSSQL_BASE_DIR is the path in MapR XD where MSSQL persists its data; it usually takes the form /mapr/<cluster name>/<volume name>. MAPR_CLDB_HOSTS is the IP address of the CLDB hosts in the MapR cluster. In our case, we have a single-node cluster, so only one IP is used. Finally, the default MSSQL port is 1433. You can use the -p option in docker to expose it on a port of your choice on the VM host. We selected the same port, 1433, in the demo.
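To make the path convention concrete, here is a minimal sketch that assembles the MSSQL_BASE_DIR value from the cluster name and volume name. The helper `mssql_base_dir` is hypothetical and not part of the PACC image; the mount point defaults to /mapr, matching MAPR_MOUNT_PATH in the command above.

```shell
# Hypothetical helper: build /<mount>/<cluster name>/<volume name>.
mssql_base_dir() {
    cluster="$1"
    volume="$2"
    mount="${3:-/mapr}"   # same default as MAPR_MOUNT_PATH in Step 3
    printf '%s/%s/%s\n' "$mount" "$cluster" "$volume"
}

# Values from the demo: cluster mapr522, volume vol1.
MSSQL_BASE_DIR="$(mssql_base_dir mapr522 vol1)"
echo "$MSSQL_BASE_DIR"   # prints /mapr/mapr522/vol1
```

The result can then be passed to the container with -e MSSQL_BASE_DIR="$MSSQL_BASE_DIR".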
There are other environment variables you can pass into MapR PACC. For more information, please refer to the MapR PACC documentation.
In a few minutes, you should see a message like the one below that indicates the MSSQL server is ready:
2017-11-16 22:54:30.49 spid19s SQL Server is now ready for client connections. This is an informational message; no user action is required.
Step 4 – Create a Table in MSSQL and Insert Some Data
Now you are ready to insert some sample data into a test MSSQL database. To do so, find the container ID of the running MSSQL container by issuing this command:
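A plain `docker ps` lists all running containers along with their IDs. As a sketch, assuming the container was started with --name pacc-mssql as in Step 3, the listing can be narrowed to that container; the helper name `container_id` is hypothetical.

```shell
# Hypothetical helper: print the ID of the running container
# whose name matches the given string.
container_id() {
    # --quiet prints only IDs; --filter name=... matches on container name.
    docker ps --quiet --filter "name=$1"
}

# Usage:
#   CID="$(container_id pacc-mssql)"
```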
Then use the docker exec command to log in to the container:
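A minimal sketch of that step follows. It assumes the image ships a bash shell, which is not confirmed here; the helper name `open_shell` is hypothetical, and the argument is the container ID found in the previous step.

```shell
# Hypothetical helper: open an interactive shell in a running container.
open_shell() {
    # -i keeps stdin open, -t allocates a terminal.
    # Assumes bash is available inside the image.
    docker exec -it "$1" bash
}

# Usage:
#   open_shell <container ID>
```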
Then issue the command below to get to an MSSQL prompt, providing the admin password you set when starting the container in Step 3 above:
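One common way to reach a prompt is Microsoft's sqlcmd utility. The sketch below is meant to be run inside the container and assumes the image installs the mssql-tools package at its usual location, /opt/mssql-tools/bin; adjust the path if your image differs. The helper name `mssql_prompt` is hypothetical.

```shell
# Assumed location of sqlcmd (default for Microsoft's mssql-tools package).
SQLCMD=/opt/mssql-tools/bin/sqlcmd

# Hypothetical helper: connect to the local server as the SA admin user.
mssql_prompt() {
    # -S server, -U login, -P password (the SA_PASSWORD from Step 3).
    "$SQLCMD" -S localhost -U SA -P "$1"
}

# Usage:
#   mssql_prompt 'your SA password'
```

Leaving out -P makes sqlcmd prompt for the password instead, which keeps it out of the shell history.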
Issue the following MSSQL statements to populate an inventory table in a test database, then query the table:
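As an illustration, the statements could look like the sketch below. The database name TestDB, the table name Inventory, the column layout, and the sample rows are all assumptions made for this example, not values confirmed by the demo. The helper `inventory_sql` is hypothetical; it only prints the T-SQL so it can be saved to a file and fed to sqlcmd.

```shell
# Hypothetical helper: emit T-SQL that creates a test database,
# populates an inventory table, and queries it.
inventory_sql() {
    cat <<'SQL'
CREATE DATABASE TestDB;
GO
USE TestDB;
CREATE TABLE Inventory (id INT, name NVARCHAR(50), quantity INT);
INSERT INTO Inventory VALUES (1, 'banana', 150);
INSERT INTO Inventory VALUES (2, 'orange', 154);
GO
SELECT * FROM Inventory WHERE quantity > 152;
GO
SQL
}

# Usage (inside the container; the sqlcmd path is an assumption):
#   inventory_sql > /tmp/inventory.sql
#   /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P 'your SA password' -i /tmp/inventory.sql
```

The same statements can also be typed one by one at the interactive prompt; GO ends a batch and sends it to the server.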
Success! This means the database has been persisted into the MapR volume and is now managed and protected by MapR XD storage. You can verify this by issuing this command in the container; the MSSQL log and data directories show up in
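A simple check, sketched here under the assumption that MSSQL writes under the MSSQL_BASE_DIR path passed in Step 3, is to list that directory from inside the container. The helper name `list_persisted` is hypothetical.

```shell
# Hypothetical helper: list what MSSQL has written to the MapR volume.
# Defaults to MSSQL_BASE_DIR, which the container receives via -e in Step 3.
list_persisted() {
    ls -l "${1:-$MSSQL_BASE_DIR}"
}

# Usage (inside the container):
#   list_persisted
#   list_persisted /mapr/mapr522/vol1
```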
Step 5 – Destroy Current Container and Relaunch a New Container and Access the Existing Table
Now let’s destroy the current container to simulate a server outage by issuing this command:
# docker rm -f c2e69e75b181
Repeat Step 3 above to launch a new container. Log in to the container and query the same inventory table as soon as the new container is up and running:
To your great relief, the data previously entered is still there, thanks to MapR!
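The query can also be issued from the VM host without opening a shell in the container. This sketch assumes sqlcmd is installed at /opt/mssql-tools/bin inside the image and reuses the hypothetical TestDB database and Inventory table names; substitute the names you actually created in Step 4. The helper name `query_inventory` is hypothetical.

```shell
# Assumed location of sqlcmd inside the container.
SQLCMD_PATH=/opt/mssql-tools/bin/sqlcmd

# Hypothetical helper: run one query in a container and exit.
#   $1 = container name or ID, $2 = SA password
query_inventory() {
    # -d selects the database; -Q runs the query and then exits.
    docker exec "$1" "$SQLCMD_PATH" -S localhost -U SA -P "$2" \
        -d TestDB -Q "SELECT * FROM Inventory;"
}

# Usage:
#   query_inventory pacc-mssql 'your SA password'
```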
Step 6 – Scale It up and Beyond
With the container technology know-how in place, it is extremely easy to spin up multiple containers all at once. Simply repeat steps 2 and 3 to assign each MSSQL container a new volume in MapR and off you go.
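When several containers share one VM host, each needs its own container name and host port in addition to its own volume. The sketch below only prints a plan and the commands it implies; it runs nothing. The naming scheme (pacc-mssql-N, volN, ports counting up from 1433) and the helper `plan_instances` are assumptions for illustration, and the `...` stands for the remaining options from Step 3.

```shell
# Hypothetical helper: print one "container volume host_port" line per instance.
plan_instances() {
    count="$1"
    base_port="${2:-1433}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "pacc-mssql-$i vol$i $((base_port + i - 1))"
        i=$((i + 1))
    done
}

# Dry run: show the commands for three instances without executing them.
plan_instances 3 | while read -r name volume port; do
    echo "maprcli volume create -path /$volume -name $volume"
    echo "docker run -d --name $name ... -e MSSQL_BASE_DIR=/mapr/mapr522/$volume -p $port:1433 mapr-azure/pacc-mssql:latest"
done
```

Each container still listens on 1433 internally; only the host side of the -p mapping changes.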
In this blog, we demonstrated how to containerize MSSQL with MapR PACC and persist its database into MapR for data protection and disaster recovery. MapR PACCs are a great way for many other applications that require a scalable and robust storage layer to have their data managed and distributed for DR and scalability. The MapR PACCs can also be managed for deployment at scale with an orchestrator like Kubernetes, Mesos, or Docker to achieve true scalability and high availability.