Use a pre-installed Minikube instance -- porting over logic from PR 521 #14

Merged
merged 14 commits into from
Jan 12, 2018
31 changes: 9 additions & 22 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,10 @@ $ ./e2e/runner.sh -m https://xyz -i test -r https://github.com/my-spark/spark -d

## Running the tests using maven

Integration tests first require installing [Minikube](https://kubernetes.io/docs/getting-started-guides/minikube/) on
your machine, with the `minikube` binary on your `PATH`. Refer to the Minikube documentation for instructions
on how to install it. It is recommended to allocate at least 8 CPUs and 8GB of memory to the Minikube cluster.
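
Since the tests now rely on a pre-installed binary, a quick pre-flight check can confirm that it is reachable. The following is a minimal sketch, not part of this change: `MinikubeOnPath` and `findOnPath` are hypothetical names, and the lookup simply walks the directories in `PATH`.

```scala
import java.io.File

object MinikubeOnPath {
  // Returns the first executable file named `binary` found in the
  // directories listed in `pathVar` (defaults to the PATH env variable).
  def findOnPath(
      binary: String,
      pathVar: String = sys.env.getOrElse("PATH", "")): Option[File] = {
    pathVar
      .split(File.pathSeparator)
      .iterator
      .filter(_.nonEmpty)
      .map(dir => new File(dir, binary))
      .find(candidate => candidate.isFile && candidate.canExecute)
  }
}
```

Calling `MinikubeOnPath.findOnPath("minikube")` yields `None` when the binary cannot be found, which a test setup could turn into an early, readable failure.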

Running the integration tests requires a Spark distribution package tarball that
contains Spark jars, submission clients, etc. You can download a tarball from
http://spark.apache.org/downloads.html. Or, you can create a distribution from
Expand Down Expand Up @@ -82,37 +86,20 @@ In order to run against any cluster, use the following:
```sh
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master> -Dspark.docker.test.driverImage=<driver-image> -Dspark.docker.test.executorImage=<executor-image>"
```

## Preserve the Minikube VM

The integration tests make use of
[Minikube](https://github.com/kubernetes/minikube), which fires up a virtual
machine and sets up a single-node Kubernetes cluster within it. By default the VM
is destroyed after the tests finish. If you want to preserve the VM, e.g.
to reduce the running time of tests during development, you can pass the
property `spark.docker.test.persistMinikube` to the test process:

```
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true
```
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master>

## Reuse the previous Docker images

The integration tests build a number of Docker images, which takes some time.
By default, the images are built every time the tests run. You may want to skip
re-building those images during development, if the distribution package did not
change since the last run. You can pass the property
`spark.docker.test.skipBuildImages` to the test process. This works only if
you set the property `spark.docker.test.persistMinikube` in the
previous run, since the Docker daemon runs inside the Minikube environment. Here
is an example:
`spark.kubernetes.test.imageDockerTag` to the test process and specify the Docker
image tag that is appropriate.
Here is an example:

```
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
"-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true -Dspark.docker.test.skipBuildImages=true"
-Dspark.kubernetes.test.imageDockerTag=latest
```
32 changes: 0 additions & 32 deletions integration-test/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,6 @@
<slf4j-log4j12.version>1.7.24</slf4j-log4j12.version>
<sbt.project.name>kubernetes-integration-tests</sbt.project.name>
<spark-distro-tgz>YOUR-SPARK-DISTRO-TARBALL-HERE</spark-distro-tgz>
<spark-dockerfiles-dir>YOUR-DOCKERFILES-DIR-HERE</spark-dockerfiles-dir>
<test.exclude.tags></test.exclude.tags>
</properties>
<packaging>jar</packaging>
Expand Down Expand Up @@ -141,37 +140,6 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>com.googlecode.maven-download-plugin</groupId>
<artifactId>download-maven-plugin</artifactId>
<version>${download-maven-plugin.version}</version>
<executions>
<execution>
<id>download-minikube-linux</id>
<phase>pre-integration-test</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-linux-amd64</url>
<outputDirectory>${project.build.directory}/minikube-bin/linux-amd64</outputDirectory>
<outputFileName>minikube</outputFileName>
</configuration>
</execution>
<execution>
<id>download-minikube-darwin</id>
<phase>pre-integration-test</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-darwin-amd64</url>
<outputDirectory>${project.build.directory}/minikube-bin/darwin-amd64</outputDirectory>
<outputFileName>minikube</outputFileName>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<!-- Triggers scalatest plugin in the integration-test phase instead of
the test phase. -->
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,9 @@ import org.scalatest.concurrent.{Eventually, PatienceConfiguration}
import org.scalatest.time.{Minutes, Seconds, Span}

import org.apache.spark.deploy.k8s.integrationtest.backend.IntegrationTestBackendFactory
import org.apache.spark.deploy.k8s.integrationtest.constants.SPARK_DISTRO_PATH
import org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend
import org.apache.spark.deploy.k8s.integrationtest.constants._
import org.apache.spark.deploy.k8s.integrationtest.config._

private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll with BeforeAndAfter {

Expand Down Expand Up @@ -59,6 +61,9 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
.set("spark.kubernetes.driver.container.image", driverImage)
.set("spark.kubernetes.executor.container.image", executorImage)
.set("spark.kubernetes.driver.label.spark-app-locator", APP_LOCATOR_LABEL)
.set(DRIVER_DOCKER_IMAGE, tagImage("spark-driver"))
.set(EXECUTOR_DOCKER_IMAGE, tagImage("spark-executor"))
.set(INIT_CONTAINER_DOCKER_IMAGE, tagImage("spark-init"))
.set("spark.kubernetes.executor.label.spark-app-locator", APP_LOCATOR_LABEL)
kubernetesTestComponents.createNamespace()
}
Expand Down Expand Up @@ -97,6 +102,7 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
}

test("Run SparkPi with custom driver pod name, labels, annotations, and environment variables.") {
doMinikubeCheck
sparkAppConf
.set("spark.kubernetes.driver.pod.name", "spark-integration-spark-pi")
.set("spark.kubernetes.driver.label.label1", "label1-value")
Expand Down Expand Up @@ -217,6 +223,7 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
}
}
}
private def tagImage(image: String): String = s"$image:${testBackend.dockerImageTag()}"

private def doBasicDriverPodCheck(driverPod: Pod): Unit = {
assert(driverPod.getSpec.getContainers.get(0).getImage === driverImage)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,32 @@ object Utils extends Logging {
try f.apply(resource) finally resource.close()
}

def tryWithSafeFinally[T](block: => T)(finallyBlock: => Unit): T = {
var originalThrowable: Throwable = null
try {
block
} catch {
case t: Throwable =>
// Purposefully not using NonFatal, because even fatal exceptions
// we don't want to have our finallyBlock suppress
originalThrowable = t
throw originalThrowable
} finally {
try {
finallyBlock
} catch {
case t: Throwable =>
if (originalThrowable != null) {
originalThrowable.addSuppressed(t)
logWarning(s"Suppressing exception in finally: ${t.getMessage}", t)
throw originalThrowable
} else {
throw t
}
}
}
}
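
To illustrate what `tryWithSafeFinally` buys over a plain `try`/`finally`, here is a small self-contained sketch. It copies the helper's shape without the logging call (the `Logging` trait is not available outside the project), and `SafeFinallyDemo` and `demo` are hypothetical names used only for this illustration.

```scala
import scala.util.{Failure, Success, Try}

object SafeFinallyDemo {
  // Same control flow as the helper above, minus the logWarning call.
  def tryWithSafeFinally[T](block: => T)(finallyBlock: => Unit): T = {
    var originalThrowable: Throwable = null
    try {
      block
    } catch {
      case t: Throwable =>
        originalThrowable = t
        throw originalThrowable
    } finally {
      try {
        finallyBlock
      } catch {
        case t: Throwable =>
          if (originalThrowable != null) {
            // Keep the body's failure primary; attach the cleanup failure.
            originalThrowable.addSuppressed(t)
            throw originalThrowable
          } else {
            throw t
          }
      }
    }
  }

  // Runs a failing body with a failing cleanup and reports the message the
  // caller sees plus the messages of any suppressed exceptions.
  def demo(): (String, Seq[String]) = {
    val result = Try {
      tryWithSafeFinally[Unit] {
        throw new IllegalStateException("body failed")
      } {
        throw new RuntimeException("cleanup failed")
      }
    }
    result match {
      case Failure(e) => (e.getMessage, e.getSuppressed.toSeq.map(_.getMessage))
      case Success(_) => ("no exception", Seq.empty[String])
    }
  }
}
```

With a plain `finally`, the cleanup exception would replace the body's exception. Here the caller still sees the body's failure, and the cleanup failure is reachable through `getSuppressed`.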

def checkAndGetK8sMasterUrl(rawMasterURL: String): String = {
require(rawMasterURL.startsWith("k8s://"),
"Kubernetes master URL must start with k8s://.")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

import org.apache.spark.deploy.k8s.integrationtest.Utils
import org.apache.spark.deploy.k8s.integrationtest.backend.IntegrationTestBackend
import org.apache.spark.deploy.k8s.integrationtest.constants.GCE_TEST_BACKEND
import org.apache.spark.deploy.k8s.integrationtest.config._

private[spark] class GCETestBackend(val master: String) extends IntegrationTestBackend {
private var defaultClient: DefaultKubernetesClient = _
Expand All @@ -37,5 +37,7 @@ private[spark] class GCETestBackend(val master: String) extends IntegrationTestB
defaultClient
}

override def name(): String = GCE_TEST_BACKEND
override def dockerImageTag(): String = {
System.getProperty(KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY, "latest")
Member:
Why not generate a random ID like minikube backend code does? i.e. UUID.randomUUID().toString.replaceAll("-", "")

Contributor:
In the Minikube case we're building these images from scratch. In the GCE case, we don't create a Docker manager and hence are not building the images there. But this in itself seems to contradict this section of our readme:

If you're using a non-local cluster, you must provide an image repository which you have write access to, using the -i option, in order to store docker images generated during the test.

which indicates that GCE-backed tests should be building images as well. Is this correct @foxish?

Member:
That readme section is meant to highlight that we push the images to an image repository only in the cloud testing case, and don't have to in the minikube case since the images are built in the minikube VM's docker environment. That documentation pertains only to the use of the script, which avoids using maven for building images.

Contributor:
The problem then with using a random ID tag here is that it's impossible for this tag to actually match anything. Using "latest" at least guarantees that we pick up some image in the default case.

We can be more strict here and require the tag be explicitly specified.

Contributor:
Looking a little closer I think the miscommunication is because the docker image manager isn't serving the image tag but is instead being handed the tag by the test backend. The responsibilities thus aren't clear and the coupling of the provision of a custom tag vs. a generated tag, and how that influences whether or not images are built or deleted, is unclear.

I'm moving the generation of the tag vs. using the user-provided one into the docker manager. This should hopefully clarify the connection.
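
The ownership rule proposed in this thread could look roughly like the sketch below. This is an assumption-laden illustration, not the code in this PR: `ImageTagPolicy` and `managesImages` are hypothetical names, and the rule that images are built and deleted only when the tag was generated is inferred from the discussion rather than confirmed by the diff.

```scala
import java.util.UUID

// Hypothetical stand-in for the docker manager discussed above; the real
// class in this PR is KubernetesSuiteDockerManager and may differ.
class ImageTagPolicy(userProvidedTag: Option[String]) {
  // Use the caller's tag when given; otherwise generate a unique one,
  // following the UUID expression quoted in the review comment.
  val dockerImageTag: String =
    userProvidedTag.getOrElse(UUID.randomUUID().toString.replaceAll("-", ""))

  // Assumption: this run builds (and later deletes) images only when it
  // generated the tag itself; a user-provided tag points at existing images.
  val managesImages: Boolean = userProvidedTag.isEmpty
}
```

Keeping both decisions in one place means the test backend only has to forward the optional system property and ask the manager for the resulting tag.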

}
}
Original file line number Diff line number Diff line change
Expand Up @@ -23,16 +23,16 @@ import org.apache.spark.deploy.k8s.integrationtest.backend.GCE.GCETestBackend
import org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend

private[spark] trait IntegrationTestBackend {
def name(): String
def initialize(): Unit
def getKubernetesClient: DefaultKubernetesClient
def dockerImageTag(): String
def cleanUp(): Unit = {}
}

private[spark] object IntegrationTestBackendFactory {
def getTestBackend(): IntegrationTestBackend = {
Option(System.getProperty("spark.kubernetes.test.master"))
.map(new GCETestBackend(_))
.getOrElse(new MinikubeTestBackend())
.getOrElse(MinikubeTestBackend)
}
}
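
The factory's selection rule (an explicit master URL selects the remote backend, otherwise the local Minikube singleton is used) can be shown in isolation. The sketch below uses simplified, hypothetical types (`Backend`, `RemoteCluster`, `LocalMinikube`) in place of the real backend classes; only the system property name comes from the diff.

```scala
object BackendSelection {
  sealed trait Backend { def name: String }

  // Hypothetical stand-in for a backend that targets an existing cluster.
  final case class RemoteCluster(master: String) extends Backend {
    val name = "remote"
  }

  // Hypothetical stand-in for the Minikube singleton backend.
  case object LocalMinikube extends Backend {
    val name = "minikube"
  }

  // A master URL, when present, wins; otherwise fall back to Minikube.
  def select(master: Option[String]): Backend =
    master.map(RemoteCluster(_)).getOrElse(LocalMinikube)

  // Option(...) turns a missing (null) system property into None.
  def fromSystemProperties(): Backend =
    select(Option(System.getProperty("spark.kubernetes.test.master")))
}
```

Because the fallback is an `object`, every caller that falls through to it shares the same instance, which matches the switch from `new MinikubeTestBackend()` to `MinikubeTestBackend` in this hunk.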
Original file line number Diff line number Diff line change
Expand Up @@ -20,74 +20,38 @@ import java.nio.file.Paths

import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

import org.apache.commons.lang3.SystemUtils
import org.apache.spark.deploy.k8s.integrationtest.{Logging, ProcessUtils}

// TODO support windows
private[spark] object Minikube extends Logging {
private val MINIKUBE_EXECUTABLE_DEST = if (SystemUtils.IS_OS_MAC_OSX) {
Paths.get("target", "minikube-bin", "darwin-amd64", "minikube").toFile
} else if (SystemUtils.IS_OS_WINDOWS) {
throw new IllegalStateException("Executing Minikube based integration tests not yet " +
" available on Windows.")
} else {
Paths.get("target", "minikube-bin", "linux-amd64", "minikube").toFile
}

private val EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE = "Minikube is not downloaded, expected at " +
s"${MINIKUBE_EXECUTABLE_DEST.getAbsolutePath}"

private val MINIKUBE_STARTUP_TIMEOUT_SECONDS = 60

// NOTE: This and the following methods are synchronized to prevent deleteMinikube from
Member:
We are deleting this note. Maybe we don't need "synchronized" any more. Kill "synchronized" below?

// destroying the minikube VM while other methods try to use the VM.
// Such a race condition can corrupt the VM or some VM provisioning tools like VirtualBox.
def startMinikube(): Unit = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
if (getMinikubeStatus != MinikubeStatus.RUNNING) {
executeMinikube("start", "--memory", "6000", "--cpus", "8")
} else {
logInfo("Minikube is already started.")
}
}

def getMinikubeIp: String = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
def getMinikubeIp: String = {
val outputs = executeMinikube("ip")
.filter(_.matches("^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$"))
assert(outputs.size == 1, "Unexpected amount of output from minikube ip")
outputs.head
}

def getMinikubeStatus: MinikubeStatus.Value = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
def getMinikubeStatus: MinikubeStatus.Value = {
val statusString = executeMinikube("status")
.filter(_.contains("minikube: "))
.filter(line => line.contains("minikubeVM: ") || line.contains("minikube:"))
.head
.replaceFirst("minikubeVM: ", "")
.replaceFirst("minikube: ", "")
MinikubeStatus.unapply(statusString)
.getOrElse(throw new IllegalStateException(s"Unknown status $statusString"))
}

def getDockerEnv: Map[String, String] = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
def getDockerEnv: Map[String, String] = {
executeMinikube("docker-env", "--shell", "bash")
.filter(_.startsWith("export"))
.map(_.replaceFirst("export ", "").split('='))
.map(arr => (arr(0), arr(1).replaceAllLiterally("\"", "")))
.toMap
}
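
The parsing done by `getDockerEnv` can be tried without a running cluster. The sketch below is a standalone variant, not the project's code: `DockerEnvParsing` and `parseDockerEnv` are hypothetical names, and it differs from the method above by splitting on the first `=` only, so values that themselves contain `=` survive.

```scala
object DockerEnvParsing {
  // Turns lines such as `export DOCKER_HOST="tcp://127.0.0.1:2376"` into
  // key/value pairs, ignoring comments and anything that is not an export.
  def parseDockerEnv(lines: Seq[String]): Map[String, String] = {
    lines
      .map(_.trim)
      .filter(_.startsWith("export "))
      .map(_.stripPrefix("export "))
      .flatMap { assignment =>
        // A limit of 2 keeps any '=' inside the value intact.
        assignment.split("=", 2) match {
          case Array(key, value) => Some(key -> value.replace("\"", ""))
          case _ => None
        }
      }
      .toMap
  }
}
```

Feeding it the lines printed by `minikube docker-env --shell bash` should give the environment map that the Docker client needs; the exact variables depend on the Minikube version.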

def deleteMinikube(): Unit = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists, EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
if (getMinikubeStatus != MinikubeStatus.NONE) {
executeMinikube("delete")
} else {
logInfo("Minikube was already not running.")
}
}

def getKubernetesClient: DefaultKubernetesClient = synchronized {
def getKubernetesClient: DefaultKubernetesClient = {
val kubernetesMaster = s"https://${getMinikubeIp}:8443"
val userHome = System.getProperty("user.home")
val kubernetesConf = new ConfigBuilder()
Expand All @@ -105,13 +69,8 @@ private[spark] object Minikube extends Logging {
}

private def executeMinikube(action: String, args: String*): Seq[String] = {
if (!MINIKUBE_EXECUTABLE_DEST.canExecute) {
if (!MINIKUBE_EXECUTABLE_DEST.setExecutable(true)) {
throw new IllegalStateException("Failed to make the Minikube binary executable.")
}
}
ProcessUtils.executeProcess(Array(MINIKUBE_EXECUTABLE_DEST.getAbsolutePath, action) ++ args,
MINIKUBE_STARTUP_TIMEOUT_SECONDS)
ProcessUtils.executeProcess(
Array("minikube", action) ++ args, MINIKUBE_STARTUP_TIMEOUT_SECONDS)
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,29 +19,33 @@ package org.apache.spark.deploy.k8s.integrationtest.backend.minikube
import io.fabric8.kubernetes.client.DefaultKubernetesClient

import org.apache.spark.deploy.k8s.integrationtest.backend.IntegrationTestBackend
import org.apache.spark.deploy.k8s.integrationtest.constants.MINIKUBE_TEST_BACKEND
import org.apache.spark.deploy.k8s.integrationtest.docker.SparkDockerImageBuilder
import org.apache.spark.deploy.k8s.integrationtest.config._
import org.apache.spark.deploy.k8s.integrationtest.docker.KubernetesSuiteDockerManager

private[spark] class MinikubeTestBackend extends IntegrationTestBackend {
private[spark] object MinikubeTestBackend extends IntegrationTestBackend {
private var defaultClient: DefaultKubernetesClient = _
private val userProvidedDockerImageTag = Option(
System.getProperty(KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY))
private val dockerManager = new KubernetesSuiteDockerManager(
Minikube.getDockerEnv, userProvidedDockerImageTag)

override def initialize(): Unit = {
Minikube.startMinikube()
if (!System.getProperty("spark.docker.test.skipBuildImages", "false").toBoolean) {
new SparkDockerImageBuilder(Minikube.getDockerEnv).buildSparkDockerImages()
}
val minikubeStatus = Minikube.getMinikubeStatus
require(minikubeStatus == MinikubeStatus.RUNNING,
s"Minikube must be running before integration tests can execute. Current status" +
s" is: $minikubeStatus")
dockerManager.buildSparkDockerImages()
defaultClient = Minikube.getKubernetesClient
}

override def getKubernetesClient(): DefaultKubernetesClient = {
defaultClient
override def cleanUp(): Unit = {
super.cleanUp()
dockerManager.deleteImages()
}

override def cleanUp(): Unit = {
if (!System.getProperty("spark.docker.test.persistMinikube", "false").toBoolean) {
Minikube.deleteMinikube()
}
override def getKubernetesClient(): DefaultKubernetesClient = {
defaultClient
}

override def name(): String = MINIKUBE_TEST_BACKEND
override def dockerImageTag(): String = dockerManager.dockerImageTag()
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.spark.deploy.k8s.integrationtest

package object config {
val KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY = "spark.kubernetes.test.imageDockerTag"
val DRIVER_DOCKER_IMAGE = "spark.kubernetes.driver.container.image"
val EXECUTOR_DOCKER_IMAGE = "spark.kubernetes.executor.container.image"
val INIT_CONTAINER_DOCKER_IMAGE = "spark.kubernetes.initcontainer.container.image"
}
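
As a rough illustration of how these constants are consumed, the sketch below reads the tag property with a `latest` default (as the GCE backend does in this diff) and appends it to an image name in the style of `tagImage`. `ImageTagExample` is a hypothetical wrapper written for this example only.

```scala
object ImageTagExample {
  // Same property name as KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY above.
  val TagProperty = "spark.kubernetes.test.imageDockerTag"

  // Read on every call so a property set later is still picked up.
  def dockerImageTag(): String = System.getProperty(TagProperty, "latest")

  // Appends the resolved tag to a bare image name.
  def tagImage(image: String): String = s"$image:${dockerImageTag()}"
}
```

Passing `-Dspark.kubernetes.test.imageDockerTag=v1` to the JVM would make `ImageTagExample.tagImage("spark-driver")` return `spark-driver:v1`; without the property it falls back to `spark-driver:latest`.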