Print traceback call stack in Python

Print Python stack traces without raising an exception.

# in /path/to/a/module.py
import traceback
for line in traceback.format_stack():
    print(line.strip())

# or

print('----- start -----')
import traceback; traceback.print_stack()
print('----- end -----')

>>> import qingcloud.iaas
>>> conn = qingcloud.iaas.connect_to_zone('pek2', '123', '456')
>>> conn.describe_instances(limit=1)
----- start -----
  File "<stdin>", line 1, in <module>
  File "qingcloud/iaas/connection.py", line 214, in describe_instances
    return self.send_request(action, body)
  File "qingcloud/iaas/connection.py", line 42, in send_request
    resp = self.send(url, request, verb)
  File "qingcloud/conn/connection.py", line 245, in send
    request.authorize(self)
  File "qingcloud/conn/connection.py", line 156, in authorize
    connection._auth_handler.add_auth(self, **kwargs)
  File "qingcloud/conn/auth.py", line 118, in add_auth
    import traceback; traceback.print_stack()
----- end -----
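
If you want the stack in a log message instead of on stdout, you could join the frames yourself. A small sketch:

# log the current call stack instead of printing it
import logging
import traceback

logging.basicConfig(level=logging.DEBUG)
logging.debug('current call stack:\n%s', ''.join(traceback.format_stack()))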

ref:
https://stackoverflow.com/questions/3925248/print-python-stack-trace-without-exception-being-raised

Find circular imports in Python

What are circular imports?
http://stackabuse.com/python-circular-imports/
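
For reference, a minimal (hypothetical) pair of modules that triggers the problem:

# a.py
import b

def func_a():
    return 'a'

# b.py
import a

# fails with AttributeError when you run "import a", because at that point
# module a is only partially initialized and func_a does not exist yet
value = a.func_a()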

You could use python -vv to inspect import relations.

$ python -vv manage.py shell
>>> from api.models import Application
>>> from member.views.site import signup

or

$ python -vv
>>> import os
>>> os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'streetvoice.settings')
>>> import django
>>> django.setup()
>>> from api.models import Application

ref:
https://stackoverflow.com/questions/6351805/cyclic-module-dependencies-and-relative-imports-in-python
https://stackoverflow.com/questions/9098787/tool-for-pinpointing-circular-imports-in-python-django

Python 3.7 has a new feature that shows how much time each module import takes. It is enabled with the -X importtime option or the PYTHONPROFILEIMPORTTIME=1 environment variable.

$ python3.7 -X importtime -c "import pipenv"

ref:
https://dev.to/methane/how-to-speed-up-python-application-startup-time-nkf

CodeTengu Weekly Issue 120 @vinta - Python, Docker, Kubernetes, Code Review

This post was also published in CodeTengu Weekly - Issue 120.

Remotely debug a Python app inside a Docker container in Visual Studio Code

Since my development environment is fully containerized but I occasionally still want to use Visual Studio Code's Python debugger, I gave the Remote Debugging feature a try and wrote a blog post documenting the tricks needed to make it play nicely with Flask's --reload. I can't help complaining, though: VS Code's Python debugger is flashy, especially with "debug.inlineValues": true, but it is painfully slow to start.

That said, as @WanCW put it, instead of spending all that time chasing parity between your development and production environments and fiddling with configs all day, you would be better off spending it writing a few more test cases.

P.S. Here "remote" mostly refers to a Docker container on your local machine, though it can of course also be an actual remote host.

kennethreitz/setup.py: A Human's Ultimate Guide to setup.py

Another one from @kennethreitz, though this time it's not some new project that reshapes the Python ecosystem but a reference setup.py. If you maintain your own Python package, it's worth a look. It was only after reading it that I learned about the cmdclass trick.

Kubernetes Deconstructed: Understanding Kubernetes by Breaking It Down

After you've read through the eight-hundred-plus pages of Kubernetes documentation, this talk makes for a good review.

Kubernetes NodePort vs LoadBalancer vs Ingress? When should I use what?

After poring over the official Kubernetes docs for ages, it was only this article that finally made the use cases of Service and Ingress click for me. Tsk.

Getting started with Data Engineering

This article gives an overview of the technologies a Data Engineer may need to deal with. Of course, no single person can fully master all of them, but that Big Data Landscape diagram still looks brutal. See for yourself.

My biggest takeaway after reading, though, is that what a so-called Data Engineer does is actually very similar to what a Web Backend Developer does: both revolve around data itself, and the technologies overlap considerably. The biggest difference may simply be that their framework of choice is called Apache Spark.

How to Do Code Reviews Like a Human (Part One)

Code review is a well-worn topic, but this article is genuinely well written. Whether you are the reviewer who finds code review a chore, the reviewee who keeps being asked for changes and takes it personally, or your team hasn't even started doing code review yet, it's very much worth a read.

YouGlish - Use YouTube to improve your English pronunciation

Our company's System Architect is Canadian, so meetings and casual chats with him are all in English, which still feels a bit unnatural for a local Taiwanese like me. My recent struggle is the pronunciation of certain words. Ordinary English words are fine, since any online dictionary has human recordings, but the technical terms we use every day (the names of frameworks or libraries, for example) aren't necessarily in a dictionary. Conveniently, @SammyLinTw recently recommended YouGlish, a site that lists YouTube clips in which a given word is pronounced. Handy!

Rick and Morty: I'm a drunk and a pervert, but I'm a good grandpa

(The article is full of spoilers.)

I saw someone on Twitter share dannvix/NflxMultiSubs, a browser extension that adds dual-language subtitles to Netflix. It reminded me of the period a while back when I was recovering at home from a herniated disc; apart from physical therapy and Pilates, I seemed to spend all my time watching Netflix. Which brings me to the show that left the deepest impression on me: Rick and Morty. No exaggeration, it is the best sci-fi animation I have ever seen, a bit like The Hitchhiker's Guide to the Galaxy with its British humor swapped out for Futurama and The Simpsons. It blew me away the first time I watched it. Highly recommended! Every episode has a post-credits scene, so don't miss it.

If you really don't like animation, you can also watch Dirk Gently's Holistic Detective Agency; the original novels were written by none other than Douglas Adams.

Shared by @vinta.

sindresorhus/awesome-scifi: Sci-Fi worth consuming

A feast for sci-fi fans!

Shared by @vinta.

Remotely debug a Python app inside a Docker container in Visual Studio Code

Visual Studio Code with the Python extension has a "Remote Debugging" feature, which means you can attach the debugger to a real remote host as well as to a container running on localhost.

In this article, we are going to debug a Flask app inside a local Docker container through VS Code's fancy debugger, while still being able to leverage Flask's auto-reloading mechanism. The same approach should apply to other Python apps.

ref:
https://code.visualstudio.com/docs/editor/debugging
https://code.visualstudio.com/docs/python/debugging#_remote-debugging

Install

On both the host OS and in the container, install ptvsd==3.0.0. Later versions of ptvsd are currently only supported experimentally by the VS Code Python extension.

$ pip3 install ptvsd==3.0.0

ref:
https://github.com/Microsoft/ptvsd
https://github.com/Microsoft/vscode-python/projects/6

Prepare

A few configuration files need to be prepared first.

# Dockerfile
FROM python:3.6.4-alpine3.6 AS builder

WORKDIR /usr/src/app/

RUN apk add --no-cache --virtual .build-deps \
    build-base \
    openjpeg-dev \
    openssl-dev \
    zlib-dev

COPY requirements.txt .
RUN pip install --user -r requirements.txt

FROM python:3.6.4-alpine3.6

ENV PATH=$PATH:/root/.local/bin
ENV FLASK_APP=app.py

WORKDIR /usr/src/app/

RUN apk add --no-cache --virtual .run-deps \
    openjpeg \
    openssl

EXPOSE 8000/tcp

COPY --from=builder /root/.local/ /root/.local/
COPY . .

# docker-compose.yml
version: '3'
services:
    db:
        image: mongo:3.4
        ports:
            - "27017:27017"
        volumes:
            - mongo-volume:/data/db
    web:
        build: .
        command: .docker-assets/start-web.sh
        ports:
            - "3000:3000"
            - "8000:8000"
        volumes:
            - .:/usr/src/app
            - ../vendors:/root/.local
        depends_on:
            - db
volumes:
    mongo-volume:

Usage

Method 1: Debug with --no-debugger, --reload and --without-threads

The convenient but slightly fragile way: with auto-reloading enabled, you can change your source code on the fly. However, you might find that the debugger takes much longer to attach with this method. It seems that --reload is not fully compatible with Remote Debugging.

We put the ptvsd code in sitecustomize.py; as a result, ptvsd runs every time auto-reloading is triggered.

Steps:

  1. Set breakpoints
  2. Run your Flask app with --no-debugger, --reload and --without-threads
  3. Start the debugger with {"type": "python", "request": "attach", "preLaunchTask": "Enable remote debug"}
  4. The pre-launch task automatically copies the ptvsd code into site-packages/sitecustomize.py
  5. Click the "Debug Anyway" button
  6. Access the part of the code that contains breakpoints

# site-packages/sitecustomize.py
try:
    import socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.close()
    import ptvsd
    ptvsd.enable_attach('my_secret', address=('0.0.0.0', 3000))
    print('ptvsd is started')
    # ptvsd.wait_for_attach()
    # print('debugger is attached')
except OSError as exc:
    print(exc)

ref:
https://docs.python.org/3/library/site.html

# .docker-assets/start-web.sh
rm -f /root/.local/lib/python3.6/site-packages/sitecustomize.py
pip3 install --user -r requirements.txt ptvsd==3.0.0
python -m flask run -h 0.0.0.0 -p 8000 --no-debugger --reload --without-threads

// .vscode/tasks.json
{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Enable remote debug",
            "type": "shell",
            "isBackground": true,
            "command": " docker cp sitecustomize.py project_web_1:/root/.local/lib/python3.6/site-packages/sitecustomize.py"
        }
    ]
}

ref:
https://code.visualstudio.com/docs/editor/tasks

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Attach",
            "type": "python",
            "request": "attach",
            "localRoot": "${workspaceFolder}",
            "remoteRoot": "/usr/src/app",
            "port": 3000,
            "secret": "my_secret",
            "host": "localhost",
            "preLaunchTask": "Enable remote debug"
        }
    ]
}

ref:
https://code.visualstudio.com/docs/editor/debugging#_launch-configurations

Method 2: Debug with --no-debugger and --no-reload

The inconvenient but more reliable way: if you change any Python code, you need to restart the Flask app and re-attach the debugger in Visual Studio Code.

Steps:

  1. Set breakpoints
  2. Add the ptvsd code to your FLASK_APP file
  3. Run your Flask app with --no-debugger and --no-reload
  4. Start the debugger with {"type": "python", "request": "attach"}
  5. Access the part of the code that contains breakpoints

# in app.py
import ptvsd
ptvsd.enable_attach('my_secret', address=('0.0.0.0', 3000))
print('ptvsd is started')
# ptvsd.wait_for_attach()
# print('debugger is attached')

ref:
http://ramkulkarni.com/blog/debugging-django-project-in-docker/

# .docker-assets/start-web.sh
pip3 install --user -r requirements.txt ptvsd==3.0.0
python -m flask run -h 0.0.0.0 -p 8000 --no-debugger --no-reload

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Attach",
            "type": "python",
            "request": "attach",
            "localRoot": "${workspaceFolder}",
            "remoteRoot": "/usr/src/app",
            "port": 3000,
            "secret": "my_secret",
            "host": "localhost"
        }
    ]
}

Method 3: Don't use Remote Debugging, Run Debugger Locally

You just run your Flask app on localhost (macOS) instead of putting it in a container. However, you can still host your database, cache server and message queue inside containers; your Python app talks to those services through the ports they expose on 127.0.0.1. Therefore, you can use VS Code's debugger without any strange tricks.

In practice, it is okay for your local development environment to differ from the production environment.
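
For example, with the docker-compose.yml below, the app on the host can reach the database and cache through their published ports. A minimal sketch, assuming pymongo and redis are installed locally:

# connect from the app running on the host to the containerized services
from pymongo import MongoClient
from redis import Redis

mongo = MongoClient('mongodb://127.0.0.1:27017/')
cache = Redis(host='127.0.0.1', port=6379)

print(mongo.admin.command('ping'))
print(cache.ping())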

# docker-compose.yml
version: '3'
services:
    db:
        image: mongo:3.6
        ports:
            - "27017:27017"
        volumes:
            - mongo-volume:/data/db
    cache:
        image: redis:4.0
        ports:
            - "6379:6379"
volumes:
    mongo-volume:

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Flask",
            "type": "python",
            "request": "launch",
            "stopOnEntry": false,
            "pythonPath": "${config:python.pythonPath}",
            "module": "flask",
            "cwd": "${workspaceFolder}",
            "args": [
                "run",
                "-h",
                "0.0.0.0",
                "-p",
                "8000",
                "--no-debugger",
                "--no-reload"
            ],
            "envFile": "${workspaceFolder}/.env",
            "debugOptions": [
                "RedirectOutput"
            ]
        }
    ]
}

Sadly, you cannot use --reload while launching your app in the debugger. Nevertheless, most of the time you don't really need the debugger - a fast auto-reloading workflow is good enough. All you need is a Makefile for running the Flask app and the Celery worker on macOS: make run_web and make run_worker.

# Makefile
install:
    pipenv install
    pipenv run pip install git+https://github.com/gorakhargosh/watchdog.git

shell:
    pipenv run python -m flask shell

run_web:
    pipenv run python -m flask run -h 0.0.0.0 -p 8000 --debugger --reload

run_worker:
    pipenv run watchmedo auto-restart -d . -p '*.py' -R -- celery -A app:celery worker -l info -E -P gevent -Ofair

Bonus

You should try enabling debug.inlineValues, which shows variable values inline in the editor while debugging. It's awesome!

// settings.json
{
    "debug.inlineValues": true
}

ref:
https://code.visualstudio.com/updates/v1_9#_inline-variable-values-in-source-code

Issues

Starting the Python debugger is fucking slow
https://github.com/Microsoft/vscode-python/issues/106

Debugging library functions won't work currently
https://github.com/Microsoft/vscode-python/issues/111

Pylint for remote projects
https://gist.github.com/IBestuzhev/d022446f71267591be76fb48152175b7

Spark troubleshooting

Apache Spark 2.x Troubleshooting Guide
https://www.slideshare.net/jcmia1/a-beginners-guide-on-troubleshooting-spark-applications
https://www.slideshare.net/jcmia1/apache-spark-20-tuning-guide

Check your cluster UI to ensure that workers are registered and have sufficient resources

PYSPARK_DRIVER_PYTHON="jupyter" \
PYSPARK_DRIVER_PYTHON_OPTS="notebook --ip 0.0.0.0" \
pyspark \
--packages "org.xerial:sqlite-jdbc:3.16.1,com.github.fommil.netlib:all:1.1.2" \
--driver-memory 4g \
--executor-memory 20g \
--master spark://TechnoCore.local:7077

TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

A likely cause is that the --executor-memory you specified exceeds the memory available on the workers.

You can check the Spark Master UI at http://localhost:8080/ to see how much memory each worker has in total. If the workers have different amounts of usable memory, Spark will only pick the workers whose memory is larger than --executor-memory.
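
If you create the SparkSession yourself, one way to keep the request within each worker's capacity is to lower the executor memory when building the session. A minimal PySpark sketch (the values are only placeholders):

from pyspark.sql import SparkSession

# ask for less memory per executor than each worker actually has
spark = SparkSession.builder \
    .master('spark://TechnoCore.local:7077') \
    .config('spark.executor.memory', '4g') \
    .config('spark.cores.max', '4') \
    .getOrCreate()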

ref:
https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application

SparkContext was shut down

ERROR Executor: Exception in task 1.0 in stage 6034.0 (TID 21592)
java.lang.StackOverflowError
...
ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(55,1494185401195,JobFailed(org.apache.spark.SparkException: Job 55 cancelled because SparkContext was shut down))

The executors probably did not have enough memory and ran out of memory (OOM).

ref:
http://stackoverflow.com/questions/32822948/sparkcontext-was-shut-down-while-running-spark-on-a-large-dataset

Container exited with a non-zero exit code 56 (or some other number)

WARN org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1504241464590_0001_01_000002 on host: albedo-w-1.c.albedo-157516.internal. Exit status: 56. Diagnostics: Exception from container-launch.
Container id: container_1504241464590_0001_01_000002
Exit code: 56
Stack trace: ExitCodeException exitCode=56:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
    at org.apache.hadoop.util.Shell.run(Shell.java:869)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Container exited with a non-zero exit code 56

The executors probably did not have enough memory and ran out of memory (OOM).
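
Besides raising --executor-memory, increasing the off-heap overhead that YARN accounts for sometimes helps with this kind of container OOM. A sketch using the Spark 2.x config key (the values are placeholders):

from pyspark.sql import SparkSession

# give each executor more memory, plus extra overhead (in MB) for YARN's accounting
spark = SparkSession.builder \
    .config('spark.executor.memory', '6g') \
    .config('spark.yarn.executor.memoryOverhead', '2048') \
    .getOrCreate()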

ref:
http://stackoverflow.com/questions/39038460/understanding-spark-container-failure

Exception in thread "main" java.lang.StackOverflowError

Exception in thread "main" java.lang.StackOverflowError
    at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at scala.collection.immutable.List$SerializationProxy.writeObject(List.scala:468)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    ...

Solution:

import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder().getOrCreate()
val sc = spark.sparkContext
sc.setCheckpointDir("./spark-data/checkpoint")

// since sc.setCheckpointDir() already enables checkpointing,
// you don't necessarily need to set checkpointInterval explicitly
val als = new ALS()
  .setCheckpointInterval(2)
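
Roughly the same fix in PySpark, as a sketch (the column names are placeholders):

from pyspark.ml.recommendation import ALS
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# enable checkpointing to truncate the long lineage that causes the StackOverflowError
spark.sparkContext.setCheckpointDir('./spark-data/checkpoint')

als = ALS(checkpointInterval=2, userCol='user', itemCol='item', ratingCol='rating')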

ref:
https://stackoverflow.com/questions/31484460/spark-gives-a-stackoverflowerror-when-training-using-als
https://stackoverflow.com/questions/35127720/what-is-the-difference-between-spark-checkpoint-and-persist-to-a-disk

Randomness of hash of string should be disabled via PYTHONHASHSEED

Solution:

$ cd $SPARK_HOME
$ cp conf/spark-env.sh.template conf/spark-env.sh
$ echo "export PYTHONHASHSEED=42" >> conf/spark-env.sh

ref:
https://issues.apache.org/jira/browse/SPARK-13330

It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation

Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.

This happens because spark.sparkContext can only be accessed from the driver program, not from the workers (for example, the lambda functions passed to RDD operations and UDFs are executed on the workers).
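
A minimal sketch of the anti-pattern and one way around it (with a broadcast variable):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext
rdd = sc.parallelize([1, 2, 3])

# wrong: the lambda runs on workers, which must not touch the SparkContext
# rdd.map(lambda x: sc.parallelize([x]).count()).collect()

# right: ship plain data to the workers instead, e.g. as a broadcast variable
lookup = sc.broadcast({1: 'a', 2: 'b', 3: 'c'})
print(rdd.map(lambda x: lookup.value[x]).collect())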

ref:
https://spark.apache.org/docs/latest/rdd-programming-guide.html#passing-functions-to-spark
https://engineering.sharethrough.com/blog/2013/09/13/top-3-troubleshooting-tips-to-keep-you-sparking/

Spark automatically creates closures:

  • for functions that run on RDDs at workers,
  • and for any global variables that are used by those workers.

One closure is sent to each worker for every task. Closures only go one way, from the driver to the workers.

ref:
https://gerardnico.com/wiki/spark/closure

Unable to find encoder for type stored in a Dataset

Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases. someDF.as[SomeCaseClass]

Solution:

import spark.implicits._

yourDF.as[YourCaseClass]

ref:
https://stackoverflow.com/questions/38664972/why-is-unable-to-find-encoder-for-type-stored-in-a-dataset-when-creating-a-dat

Task not serializable

Caused by: java.io.NotSerializableException: Settings
Serialization stack:
    - object not serializable (class: Settings, value: Settings@...)
    - field (class: Settings$$anonfun$1, name: $outer, type: class Settings)
    - object (class Settings$$anonfun$1, <function1>)
Caused by: org.apache.spark.SparkException:
    Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)

This usually means you referenced an object from the driver program inside a closure function. Spark automatically serializes the referenced object and ships it to the worker nodes together with the task, so if that object or its class cannot be serialized, you get this error.
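
PySpark hits the analogous problem as a pickling error. A common workaround is to create the unserializable resource inside the task instead of capturing it from the driver; a sketch, with sqlite3 standing in for any unpicklable client:

import sqlite3
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
rdd = spark.sparkContext.parallelize([1, 2, 3])

def process_partition(rows):
    # create the unpicklable resource on the worker, inside the task,
    # instead of capturing one that was created on the driver
    conn = sqlite3.connect(':memory:')
    for row in rows:
        yield row  # ... use conn here
    conn.close()

print(rdd.mapPartitions(process_partition).collect())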

ref:
https://www.safaribooksonline.com/library/view/spark-the-definitive/9781491912201/ch04.html#user-defined-functions
http://www.puroguramingu.com/2016/02/26/spark-dos-donts.html
https://stackoverflow.com/questions/36176011/spark-sql-udf-task-not-serialisable
https://stackoverflow.com/questions/22592811/task-not-serializable-java-io-notserializableexception-when-calling-function-ou
https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/troubleshooting/javaionotserializableexception.html
https://mp.weixin.qq.com/s/BT6sXZlHcufAFLgTONCHsg

If you only run into this error in a Databricks Notebook, it's because notebooks work a bit differently from a regular Spark application; you can try package cells.

ref:
https://docs.databricks.com/user-guide/notebooks/package-cells.html

java.lang.IllegalStateException: Cannot find any build directories.

java.lang.IllegalStateException: Cannot find any build directories.
    at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248)
    at org.apache.spark.launcher.AbstractCommandBuilder.getScalaVersion(AbstractCommandBuilder.java:240)
    at org.apache.spark.launcher.AbstractCommandBuilder.buildClassPath(AbstractCommandBuilder.java:194)
    at org.apache.spark.launcher.AbstractCommandBuilder.buildJavaCommand(AbstractCommandBuilder.java:117)
    at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:39)
    at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:45)
    at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:63)
    at org.apache.spark.deploy.worker.CommandUtils$.buildProcessBuilder(CommandUtils.scala:51)
    at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:145)
    at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)

A possible cause is that SPARK_HOME is not set, or that your launch script does not pick up that environment variable.