Defer provider SDK imports and harden startup launchers#347
Conversation
|
@namabeeru Can you point this PR to the improvement/fix-provider-sdk-imports branch? |
|
Done, this PR now targets improvement/fix-provider-sdk-imports. |
Lazy-load OpenAI/Anthropic SDKs in the model factory; raise a friendly install hint when a needed SDK is missing. Clean portion of #347 at its final reviewed state. Co-authored-by: namabeeru <github.body594@passmail.com>
|
Thanks @namabeeru really nice work, and the test coverage is appreciated. 🙏 I've merged the model-factory change (the lazy OpenAI/Anthropic SDK imports) into I'd like to hold the rest of this PR for a couple of revisions before merging:
The macOS shortcut changes look fine. Once the two items above are addressed I'm happy to merge the remainder. Thanks again! |
a86a437
into
CraftOS-dev:improvement/fix-provider-sdk-imports
This reverts commit a86a437. Holding #347 pending revisions discussed on the PR: the runtime_preflight probe should warn-and-continue (not sys.exit) on an inconclusive result, and craftbot.py BACKEND_URL must derive from --backend-port instead of a hardcoded 7926. The clean model-factory change was taken separately in #348.
|
Quick follow-up @namabeeru: this PR briefly got merged into the integration branch in full by mistake, and I've now reverted that (the branch no longer carries these changes). The model-factory portion is already in via #348. Whenever you have a chance to address the two items above (the runtime_preflight warn-and-continue, and deriving the backend port from --backend-port), please open a follow-up PR and I'll get the rest in. Thanks again, and sorry for the back-and-forth! |
Summary
app.runtime_preflight) used by bothrun.pyand directapp.mainstartup. It checks startup-critical imports in the exact Python runtime that will run the backend and reports wrong-interpreter installs before the traceback path.craftbot.py startdetect early child-process failure, remove stale PID files, print the relevant log tail instead of reporting false success, and return a nonzero CLI exit code on failedstart/restart.run.pyalready checked the runtime, ignore staleCRAFTBOT IS READYlines from previous logs, and make install flows fail instead of claiming success when the service does not start.app.mainpreflight ordering, service startup failure reporting, source shortcut generation, stale readiness logs, install failure reporting, and CLI failure exit codes.Root cause
There are four related failure modes:
agent_core.core.models.factoryimportedopenaiandanthropicat module load, so backend startup could fail before the selected provider mattered.install.pydependencies. The crash then appears deep inside backend imports after frontend startup has already begun.craftbot.py startwrote the child PID and printedCRAFTBOT STARTEDimmediately after spawningrun.py, even if the child exited right away. That left users with a desktop shortcut to a server that was not actually running. Loop testing also found the failedstartcommand still exited0; this PR changes failedstart/restartCLI calls to exit1.CraftBot.commandonly openedhttp://localhost:7925; it did not start the service if nothing was listening there, and it could also open a frontend-only/stale service without verifying the backend.Why this approach
This avoids introducing a parallel OpenAI-compatible HTTP client. The project already declares and uses the OpenAI SDK for DeepSeek/OpenRouter/Grok/MiniMax/Moonshot request paths, so the safer open-source fix is to import it only when needed and report a clear install error if that selected path is unavailable.
The runtime preflight handles the broader Python mismatch directly. It checks packages that are imported during backend startup, but intentionally leaves provider SDKs out so provider-specific SDK failures happen only when the selected provider path actually needs that SDK.
The service-manager changes keep
craftbot.py start,craftbot.py install, and the generated desktop shortcut honest: quick startup failures surface in the terminal with the last log lines, stale PID files are removed, failed CLI starts return nonzero, stale readiness markers are ignored, and the macOS source shortcut opens the existing local service only when both frontend and backend respond.Validation
Focused checks:
/usr/local/bin/python3.10 -m pytest tests/test_craftbot_service.py tests/test_model_factory.py tests/test_run_dependency_check.py -q-> 21 passed./usr/local/bin/python3.10 -m py_compile craftbot.py tests/test_craftbot_service.py agent_core/core/models/factory.py run.py app/main.py app/runtime_preflight.py tests/test_model_factory.py tests/test_run_dependency_check.pygit diff --checkLoop checks:
21 passedeach iteration).python3 run.py --cliwith Homebrew Python 3.14: 10 iterations, all exited nonzero with the missing-dependencies message and no traceback.python3 -m app.main --cliwith Homebrew Python 3.14: 10 iterations, all exited nonzero with the missing-dependencies message and no traceback.python3 craftbot.py startwith Homebrew Python 3.14: 10 iterations, all reportedCraftBot failed to start, exited nonzero, did not printCRAFTBOT STARTED, and did not leave a stale PID./usr/local/bin/python3.10.app.mainwithopenaiandanthropicdeliberately blocked: 10 iterations, all imported successfully.Manual smoke:
/usr/local/bin/python3.10 craftbot.py start --clistarted successfully,craftbot.py statusreported the PID running, and/usr/local/bin/python3.10 craftbot.py stopstopped it cleanly.Broader baseline:
/usr/local/bin/python3.10 -m pytest tests -q-> 107 passed, 4 unrelated failures intests/test_trigger_router_and_parking.py; the fake LLM there does not accept the newerprompt_namekeyword./usr/local/bin/python3.10 -m pytest -qstill fails during collection in generated/template living-ui test folders:app/data/living_ui_modules/auth/backend/tests/test_auth.pyimportsmodelsas a top-level module.app/data/living_ui_template/backend/testsconflicts with the roottests.conftestimport path.