I have an endpoint that receives a batch of 10k-20k records. It returns a job ID and launches deferred tasks to process the records in parallel. Sometimes one of the newly started instances grabs a few of the tasks but never actually processes them; it looks as though the instance dies immediately.
Eventually those tasks hit their 10-minute timeout and are launched again.
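The fan-out described above might look roughly like the following. This is a minimal sketch under assumptions, not the actual handler: `chunked`, `process_chunk`, `launch_batch`, and the chunk size are hypothetical, and `enqueue` stands in for `deferred.defer` from `google.appengine.ext.deferred` so that the snippet can run without the App Engine SDK.

```python
import uuid


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def process_chunk(job_id, chunk):
    """Hypothetical worker: process one chunk of records for a job."""
    for record in chunk:
        pass  # per-record work would go here


def launch_batch(records, enqueue, chunk_size=10):
    """Split a batch into chunks, enqueue one task per chunk, return the job ID.

    On App Engine, `enqueue` would be deferred.defer; it is injected here
    (an assumption of this sketch) so the code runs outside the SDK.
    """
    job_id = uuid.uuid4().hex
    for chunk in chunked(records, chunk_size):
        enqueue(process_chunk, job_id, chunk)
    return job_id
```

Each enqueued task is an independent request, so any task routed to an instance that is still loading waits on that instance's loading request.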
If I find one of these tasks and filter the logs by the ID of the instance that was running it, this is what I see in the Google Logs Viewer:
Most of the log entries contain only this message: "Process terminated because the request deadline was exceeded during a loading request." The message's timestamp is 10 minutes after the timestamp of the request.
One has this stack trace:
Traceback (most recent call last):
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
self._update_configs()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 283, in _update_configs
self._lock.acquire()
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/threading.py", line 170, in acquire
self.__count = self.__count + 1
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
Another has this one:
(/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py:252)
Traceback (most recent call last):
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
self._update_configs()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 287, in _update_configs
self._registry.initialize()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 160, in initialize
import_func(self._modname)
File "/base/data/home/apps/s~myappid/dev.403063962077465992/appengine_config.py", line 12, in <module>
vendor.add('lib')
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/vendor/__init__.py", line 40, in add
elif os.path.isdir(path):
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/genericpath.py", line 52, in isdir
return stat.S_ISDIR(st.st_mode)
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
Another has this one:
(/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py:252)
Traceback (most recent call last):
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
self._update_configs()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 287, in _update_configs
self._registry.initialize()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 160, in initialize
import_func(self._modname)
File "/base/data/home/apps/s~myappid/dev.403063962077465992/appengine_config.py", line 14, in <module>
from lib import requests
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/__init__.py", line 52, in <module>
from .packages.urllib3.contrib import pyopenssl
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/__init__.py", line 27, in <module>
from . import urllib3
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/__init__.py", line 8, in <module>
from .connectionpool import (
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/connectionpool.py", line 29, in <module>
from .connection import (
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/connection.py", line 39, in <module>
from .util.ssl_ import (
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/__init__.py", line 3, in <module>
from .connection import is_connection_dropped
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/connection.py", line 145, in <module>
HAS_IPV6 = _has_ipv6('::1')
File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/connection.py", line 135, in _has_ipv6
sock.bind((host, 0))
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/socket.py", line 227, in meth
return getattr(self._sock,name)(*args)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/remote_socket/_remote_socket.py", line 663, in bind
self._CreateSocket(bind_address=address)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/remote_socket/_remote_socket.py", line 609, in _CreateSocket
'remote_socket', 'CreateSocket', request, reply)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 95, in MakeSyncCall
return stubmap.MakeSyncCall(service, call, request, response)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 329, in MakeSyncCall
rpc.CheckSuccess()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 133, in CheckSuccess
elif self.exception:
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 136, in exception
@property
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.
The main issue is that I need to finish processing the batch within 5-10 minutes.
Each record in the batch should take only a minute to process, so one solution would be to lower the 10-minute timeout, but Google support said that isn't possible.
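To make the constraint concrete, here is the back-of-the-envelope arithmetic. It assumes the worst case of 20,000 records at one minute each, and that each worker handles one record at a time; `workers_needed` is a hypothetical helper, and the numbers are illustrative rather than measured.

```python
import math


def workers_needed(num_records, minutes_per_record, budget_minutes):
    """Minimum number of workers so every record finishes within the budget,
    assuming each worker processes its records one at a time."""
    records_per_worker = budget_minutes // minutes_per_record
    if records_per_worker < 1:
        raise ValueError("budget is shorter than a single record")
    return int(math.ceil(num_records / float(records_per_worker)))


print(workers_needed(20000, 1, 10))  # 2000
print(workers_needed(20000, 1, 5))   # 4000
```

A task that sits on a dead instance for the full 10 minutes before being launched again uses up the entire budget on its own, which is why the stalled tasks matter so much here.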
I tried implementing warmup requests to address the loading requests, but that seemed to have no impact.
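For context, a warmup handler on the Python 2.7 runtime is a handler for `GET /_ah/warmup`, enabled by listing `warmup` under `inbound_services` in `app.yaml`. The post does not show the actual setup, so the following is only a minimal sketch: `warmup_app` is a hypothetical name, written as a plain WSGI callable so that it runs without the SDK.

```python
def warmup_app(environ, start_response):
    """Minimal WSGI handler for App Engine warmup requests (GET /_ah/warmup)."""
    if environ.get('PATH_INFO') == '/_ah/warmup':
        # Expensive imports or cache priming would go here, so that later
        # requests on this instance do not pay for them.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'warmed up']
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']
```

Note that the tracebacks above time out inside `appengine_config.py`, which is loaded before any request handler runs, the warmup handler included.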