Shanghai Jiao Tong University Open-Sources SkVM: A Language Virtual Machine for Skills Enabling "Write Once, Run Efficiently Anywhere"

While Skills are powerful, their performance often falters due to incompatibilities with different models and Agent Harnesses. Not all models can effectively utilize every Skill; in some cases, implementation can even degrade performance.
To address this challenge, the IPADS research team at Shanghai Jiao Tong University has introduced SkVM: a Language Virtual Machine designed specifically for Skills. In the era of AI Agents, Skills function as code, while different large language models (LLMs) act as heterogeneous processors. Drawing inspiration from classic language virtual machines like the Java Virtual Machine (JVM), the team has designed the first native virtual machine for Skills. SkVM allows a Skill to be written once and run efficiently across any model and Agent Harness. Skills compiled with SkVM can enable smaller models (e.g., 30B parameters) to achieve accuracy comparable to Opus 4.6, while reducing token consumption by 40% and boosting execution speed by up to 50 times. This technology provides a one-click enhancement to the execution speed, token efficiency, and task accuracy of mainstream Agent frameworks like OpenClaw, Hermes, openJiuwen, and PI, as well as the broader Skill ecosystem such as ClawHub.
The Mismatch Between Skills and Models
The execution of the same Skill can vary dramatically across different model and harness combinations, sometimes even hindering performance. Researchers from Shanghai Jiao Tong University's IPADS lab analyzed over 118,000 Skills and found that:
  • 15% of tasks saw a performance decrease after using a Skill.

  • 87% of tasks showed no improvement on at least one model.

  • Some Skills caused token costs to skyrocket by 451% with no corresponding increase in success rate.

The reason is straightforward: Skills are written as "natural language code," but models and runtime environments differ vastly, creating a significant semantic gap between a Skill's requirements and the capabilities provided by the model and environment.
  • Model Capability Mismatch: A Skill may assume a highly capable model, but a smaller model might not understand the instructions; forcing the Skill anyway produces the performance drop seen in 15% of tasks.

  • Environment Dependencies Cause Errors: If a Skill requires a Python package that isn't installed on the user's machine, the LLM is forced into a cycle of trial and error, wasting a large number of tokens.

  • Slow and Costly: For highly repetitive and rigid tasks, the LLM must re-run its "inference-tool calling" loop each time, resulting in extremely high token costs.
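The environment-dependency problem above is easy to make concrete. As a minimal sketch (the function and package names are hypothetical, not SkVM's actual API), a single up-front check can replace the LLM's runtime trial-and-error loop:

```python
import importlib.util

def missing_dependencies(required):
    """Return the subset of required packages that are not importable.

    Without a check like this, a missing package is only discovered when a
    script fails mid-run, and the LLM then burns tokens on trial and error.
    """
    return [pkg for pkg in required if importlib.util.find_spec(pkg) is None]

# Hypothetical dependency list a Skill might declare.
skill_requires = ["json", "csv", "definitely_not_installed_pkg"]
print(missing_dependencies(skill_requires))
```

Running the check once before execution lets the harness install or report missing packages in a single step instead of an open-ended retry loop.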

SkVM: Write Once, Run Efficiently Anywhere!
To tackle these pain points, the Shanghai Jiao Tong University team drew inspiration from traditional language virtual machine designs to create SkVM, a virtual machine architecture for natural language.
SkVM's design is analogous to the classic JVM. It abstracts the underlying runtime, incorporates Ahead-of-Time (AOT) and Just-in-Time (JIT) compilation, and performs adaptive optimization and scheduling at runtime.
AOT Compilation (Ahead-of-Time Compilation)
This process compiles Skills into a format that is more comprehensible to models. During Skill installation, the AOT compiler (composed of a compilation-optimized Skill and an LLM) generates multiple compiled artifacts. These artifacts help the Agent Harness and LLM better understand the Skill during runtime. Before execution, SkVM performs three key steps:
  • PASS-1: Capability-Based Compilation System: The system identifies 26 "Primitive Capabilities" (e.g., tool invocation, instruction following, format alignment) and benchmarks the LLM's proficiency in each, similar to a CPU benchmark. Unlike other LLM test suites, these capabilities test for basic, orthogonal, composable, and logic-agnostic skills rather than complex problem-solving. Each capability is scored to generate an objective capability profile for the LLM+Harness combination. The compiler then analyzes the Skill to determine its required primitive capabilities and levels. If a Skill's requirements exceed the LLM+Harness's capabilities, the compiler optimizes the Skill to lower its demands. For example, if a Skill uses relative paths for predefined scripts and the LLM+Harness lacks the ability to parse them, the compiler converts them to absolute paths during installation, reducing the required level for the "script execution" capability.

  • PASS-2: Environment Binding: Skills often define required environments and dependencies. During runtime, the LLM typically checks for and installs these, leading to significant token waste or installation failures. The AOT compiler automatically extracts the necessary packages and tools, generating installation/verification scripts. This allows for one-click environment setup before execution, eliminating the need for the LLM to troubleshoot.

  • PASS-3: Concurrency Extraction: Over 76% of Skills contain workflows, which Agent harnesses typically execute serially by default. AOT compilation can uncover parallelization opportunities at different granularities within a Skill's execution, including data parallelism (one instruction, multiple data), instruction parallelism (parallel execution of independent instructions), and thread parallelism (multiple independent sub-agents handling different subtasks). It then generates a parallelizable DAG (Directed Acyclic Graph) workflow. Developers can also register custom compilation optimization mechanisms with the AOT compiler for further pre-runtime optimization.
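The capability-matching idea in PASS-1 can be illustrated with a small sketch. This is not SkVM's implementation; the profile values, capability names, and the `lower_script_demands` lowering pass are hypothetical stand-ins for the article's relative-to-absolute-path example:

```python
# Hypothetical capability profile for one LLM+Harness combination,
# scored 0-100 per primitive capability (the team defines 26 of them).
profile = {"tool_invocation": 80, "instruction_following": 60, "script_execution": 30}

# Requirements the compiler extracted from a Skill.
skill_requires = {"instruction_following": 50, "script_execution": 60}

def gaps(profile, requires):
    """Capabilities where the Skill demands more than the model provides."""
    return {cap: lvl - profile.get(cap, 0)
            for cap, lvl in requires.items()
            if lvl > profile.get(cap, 0)}

def lower_script_demands(skill_text, skill_dir):
    """Example lowering pass: rewrite relative script paths to absolute ones,
    reducing the level needed for the 'script_execution' capability."""
    return skill_text.replace("./", skill_dir + "/")

print(gaps(profile, skill_requires))  # script_execution is 30 points short
```

When `gaps` is non-empty, the compiler applies lowering passes like `lower_script_demands` until the Skill's demands fit the measured profile.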

Runtime Optimization: More Accurate and Efficient with Every Run
Beyond static compilation, SkVM employs Just-in-Time (JIT) compilation at runtime to accelerate Skill execution.
  • Code Solidification: Scripts defined in a Skill are often code templates with variable parameters. Each time the Skill runs, the LLM must regenerate the executable script, wasting a large number of tokens. To counter this, during the AOT phase, SkVM generates a code fingerprint, template, and corresponding parameter list. During runtime, the code generated by the LLM is matched against the pre-generated AOT code fingerprint. If a match is successful multiple times consecutively, SkVM uses JIT Compilation to solidify the executable code based on the input parameters, rather than having the LLM regenerate it each time.

  • Adaptive Recompilation: If an error or retry occurs during runtime, the system collects error logs and feeds them back to the compiler for automatic re-optimization of the Skill. This prevents the same errors from recurring and improves the overall task success rate.
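The code-solidification mechanism can be sketched as follows. Everything here is an assumption for illustration: the fingerprinting scheme (masking string and numeric literals) and the threshold of three consecutive matches are stand-ins for whatever SkVM actually uses:

```python
import re
from collections import defaultdict

SOLIDIFY_AFTER = 3  # assumed: consecutive matches before JIT freezes the code

def fingerprint(code):
    """Hypothetical fingerprint: mask quoted strings and numbers so only
    the code's shape (the template) remains."""
    shape = re.sub(r'"[^"]*"|\'[^\']*\'', "<STR>", code)
    return re.sub(r"\b\d+\b", "<NUM>", shape)

match_streak = defaultdict(int)
solidified = {}  # fingerprint -> code the LLM no longer needs to regenerate

def observe(llm_code, aot_fingerprints):
    """Compare LLM-generated code against AOT fingerprints; solidify after
    enough consecutive matches. Returns True once the template is frozen."""
    fp = fingerprint(llm_code)
    if fp in aot_fingerprints:
        match_streak[fp] += 1
        if match_streak[fp] >= SOLIDIFY_AFTER:
            solidified[fp] = llm_code  # future runs only fill in parameters
    else:
        match_streak[fp] = 0
    return fp in solidified
```

Once a fingerprint is solidified, the runtime substitutes input parameters into the frozen template directly, skipping the LLM's code-generation step entirely.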

During runtime, in addition to JIT compilation, SkVM manages the Skill lifecycle and loading, ensuring that new compiled artifacts are correctly loaded and executed. It also adjusts the degree of parallelism based on current system resources to minimize unnecessary resource contention.
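The interaction between the PASS-3 DAG and resource-aware scheduling can be sketched like this. The workflow, step names, and wave-by-wave scheduler below are illustrative assumptions, not SkVM internals:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical workflow DAG from PASS-3: each step lists its dependencies.
dag = {
    "fetch_a": [], "fetch_b": [],          # independent -> run in parallel
    "merge":   ["fetch_a", "fetch_b"],
    "report":  ["merge"],
}

def run_dag(dag, run_step, max_workers=None):
    """Execute a Skill workflow DAG wave by wave, running every step whose
    dependencies are satisfied in parallel. Pool size defaults to the CPU
    count, standing in for SkVM's resource-aware parallelism control."""
    max_workers = max_workers or os.cpu_count() or 1
    done, waves = set(), []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(done) < len(dag):
            ready = [s for s, deps in dag.items()
                     if s not in done and all(d in done for d in deps)]
            list(pool.map(run_step, ready))  # one wave of parallel steps
            done.update(ready)
            waves.append(ready)
    return waves

waves = run_dag(dag, run_step=lambda step: None)
```

Here `fetch_a` and `fetch_b` execute concurrently in the first wave, while `merge` and `report` wait for their dependencies; shrinking `max_workers` under load models the contention-avoidance the runtime performs.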
Experimental Results: Small Models with SkVM Rival Opus 4.6, Efficiency Boosts Up to 50x
The research team tested SkVM on 118 representative tasks, including code generation and data analysis. The results showed significant benefits, especially for weaker, smaller models, as it compensates for their shortcomings in handling complex JSON structures, environment dependencies, and script parsing. This enables a Qwen 30B model to achieve task success rates comparable to Opus 4.6. For top-tier models, using SkVM-compiled Skills reduced token consumption by up to 40%.
Thanks to the "Code Solidification" technology, the execution time for code sections was compressed from over ten thousand milliseconds to just a few hundred, representing a 19x to 50x speed increase. By leveraging data, instruction, and thread parallelism, SkVM improved the overall execution efficiency of Skills by up to 3.2 times.
Currently, SkVM can be seamlessly integrated into mainstream Agent frameworks such as OpenClaw, Hermes Agent, openJiuwen Agent, and PI Agent, and supports the broader Skill ecosystem, including ClawHub.

