Defold Goes Fully 3D: A Technical Breakdown and Architecture Tutorial
In a nutshell
Learn Defold's new 3D workflows with step-by-step code examples and 3D architecture best practices for indie devs.
Every indie developer knows the moment their game engine betrays them. You start with a lightweight 2D concept, scope creep introduces 3D elements, and suddenly your engine's build size explodes, your load times crawl, and your web builds crash out of memory. Massive engines like Unity and Unreal are phenomenal for AAA fidelity, but for solo developers aiming for frictionless, cross-platform distribution—especially WebGL and mobile—they often feel like driving a tank to the grocery store.
Enter Defold. Historically championed as an ultra-fast, zero-bloat 2D engine used by studios to hit every platform from HTML5 to Nintendo Switch with a single codebase, Defold has officially grown up. It is now a fully capable 3D game engine. While it always technically rendered in a 3D context under the hood (projecting flat planes via an orthographic camera), the recent updates have introduced dedicated 3D tooling, complete glTF mesh support, and a streamlined workflow for true 3D development.
This is a massive shift for the ecosystem. If you are looking for an engine that compiles in seconds, produces single-digit megabyte binaries, and still gives you the power to render dynamic 3D worlds, Defold is now a top-tier contender. In this Defold 3D tutorial and architectural breakdown, we are going to dive deep into how Defold's 3D rendering pipeline works, how to script a custom 3D camera from scratch, and the backend implications of networking a lightweight 3D game.
Under the Hood: The Reality of Defold's 3D Pipeline
To understand how to master Defold in 3D, you have to understand its rendering philosophy. Defold does not hand you a pre-configured PBR (Physically Based Rendering) pipeline out of the box like Unreal Engine does. Instead, it provides a highly optimized, data-driven render script written in Lua.
Everything drawn on screen in Defold is handled by a render_script. By default, this script is configured for 2D. It sets up an orthographic projection matrix, sorts sprites by their Z-value (depth), and draws them back-to-front. To unlock Defold's 3D capabilities, we must rewrite this script to utilize a perspective projection matrix, enable hardware depth testing, and define custom render predicates for our 3D models.
This low-level access is a double-edged sword. On one hand, you have to write a bit of matrix math. On the other hand, you have absolute control over your draw calls, allowing you to optimize rendering for ultra-low-end hardware in ways that monolithic engines make incredibly difficult.
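To make that "bit of matrix math" concrete, here is a minimal plain-Lua sketch of the kind of matrix a perspective projection produces. It assumes the common OpenGL-style convention (camera looking down negative Z, depth mapped to the -1..1 range). The helpers `perspective` and `project` are hypothetical names for illustration and are not part of Defold's API; inside a render script you would use the engine's own vmath functions instead.

```lua
-- Build a row-major 4x4 perspective matrix (OpenGL-style convention, assumed here).
-- fov is the vertical field of view in radians.
local function perspective(fov, aspect, near, far)
    local f = 1 / math.tan(fov / 2)
    return {
        { f / aspect, 0, 0, 0 },
        { 0, f, 0, 0 },
        { 0, 0, (far + near) / (near - far), (2 * far * near) / (near - far) },
        { 0, 0, -1, 0 },
    }
end

-- Multiply the matrix by the point (x, y, z, 1), then apply the perspective divide.
local function project(m, x, y, z)
    local out = {}
    for row = 1, 4 do
        out[row] = m[row][1] * x + m[row][2] * y + m[row][3] * z + m[row][4]
    end
    local w = out[4]
    return out[1] / w, out[2] / w, out[3] / w
end

local proj = perspective(math.rad(60), 16 / 9, 0.1, 1000)
local _, _, z_near = project(proj, 0, 0, -0.1)   -- a point on the near plane
local _, _, z_far = project(proj, 0, 0, -1000)   -- a point on the far plane
print(z_near, z_far)
```

Under this convention a point on the near plane lands at a depth of about -1 and a point on the far plane at about 1, which is the range the depth buffer then works with.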
Architecting Your Custom 3D Render Script
To render true 3D models without them overlapping incorrectly based on draw order, we need to enable the depth buffer (Z-buffer). Here is a foundational 3D render script that replaces Defold's default pipeline.
-- main/3d_pipeline.render_script

function init(self)
    -- Define the background clear color (RGBA)
    self.clear_color = vmath.vector4(0.1, 0.1, 0.12, 1.0)
    self.clear_buffers = {
        [render.BUFFER_COLOR_BIT] = self.clear_color,
        [render.BUFFER_DEPTH_BIT] = 1.0,
        [render.BUFFER_STENCIL_BIT] = 0
    }

    -- Create render predicates. 'model' is the default tag for 3D meshes
    self.predicates = {
        model = render.predicate({"model"}),
        gui = render.predicate({"gui"}),
        text = render.predicate({"text"})
    }
end

function update(self)
    -- 1. Set up the rendering state for the frame
    render.set_depth_mask(true)
    render.set_stencil_mask(0xff)
    render.clear(self.clear_buffers)

    -- 2. Configure the 3D camera projection
    local window_width = render.get_window_width()
    local window_height = render.get_window_height()
    if window_width == 0 or window_height == 0 then return end
    -- Keep the viewport in sync with the window size
    render.set_viewport(0, 0, window_width, window_height)

    local aspect_ratio = window_width / window_height
    local fov = math.rad(60) -- 60 degree field of view
    local near_z = 0.1
    local far_z = 1000.0
    local proj_matrix = vmath.matrix4_perspective(fov, aspect_ratio, near_z, far_z)
    render.set_projection(proj_matrix)

    -- 3. Draw 3D models with depth testing and back-face culling enabled
    render.enable_state(render.STATE_DEPTH_TEST)
    render.set_depth_func(render.COMPARE_FUNC_LEQUAL)
    render.enable_state(render.STATE_CULL_FACE)
    render.set_cull_face(render.FACE_BACK)
    render.disable_state(render.STATE_BLEND)

    -- The view matrix is passed via messages from our camera script
    if self.view_matrix then
        render.set_view(self.view_matrix)
        render.draw(self.predicates.model)
    end

    -- 4. Draw GUI over the 3D scene (orthographic)
    render.set_depth_mask(false)
    render.disable_state(render.STATE_DEPTH_TEST)
    render.disable_state(render.STATE_CULL_FACE)
    render.enable_state(render.STATE_BLEND)
    render.set_blend_func(render.BLEND_SRC_ALPHA, render.BLEND_ONE_MINUS_SRC_ALPHA)

    local gui_proj = vmath.matrix4_orthographic(0, window_width, 0, window_height, -1, 1)
    render.set_projection(gui_proj)
    render.set_view(vmath.matrix4())
    render.draw(self.predicates.gui)
    render.draw(self.predicates.text)
end

function on_message(self, message_id, message)
    if message_id == hash("set_view_matrix") then
        self.view_matrix = message.matrix
    end
end
Notice how explicitly we are managing state here. We clear the depth buffer, calculate the perspective matrix based on the current window dimensions, enforce back-face culling to save rasterization cycles, and finally, switch the projection matrix back to orthographic before rendering the GUI. This gives you a robust, split-pipeline rendering architecture.
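To see why the depth buffer removes the dependence on draw order, here is a small plain-Lua sketch of a less-or-equal depth test on a single row of pixels. It is a conceptual model only: the buffer layout and the `draw_fragment` helper are assumptions made for illustration, not how the GPU or Defold implements the test.

```lua
-- A tiny software depth buffer: one row of pixels, cleared to the far value 1.0.
local WIDTH = 4
local depth, color = {}, {}
for x = 1, WIDTH do
    depth[x] = 1.0
    color[x] = "background"
end

-- Write a fragment only if it passes a less-or-equal depth comparison.
local function draw_fragment(x, z, name)
    if z <= depth[x] then
        depth[x] = z
        color[x] = name
    end
end

-- Draw order is deliberately "wrong": the near wall is drawn before the far tree.
draw_fragment(2, 0.3, "wall")
draw_fragment(2, 0.8, "tree")
draw_fragment(3, 0.8, "tree")

print(color[2], color[3], color[4]) -- wall    tree    background
```

The tree fragment at pixel 2 is rejected because a nearer fragment already owns that pixel, so the result is the same whichever object is drawn first.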
Building a First-Person 3D Camera Controller
A render script alone will not show you much without a camera moving through the space. Defold operates heavily on a Message Passing architecture. Unlike object-oriented engines where the camera might directly call transform.Translate(), in Defold, our camera script will calculate its view matrix and dispatch it to the render script we just wrote.
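Let's construct a standard first-person camera that handles mouse-look (pitch and yaw) and keyboard movement (WASD). The script assumes that the W, S, A, and D keys are bound to the actions move_forward, move_backward, move_left, and move_right in your project's input bindings.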
-- scripts/camera_controller.script
go.property("mouse_sensitivity", 0.2)
go.property("move_speed", 10.0)

function init(self)
    msg.post(".", "acquire_input_focus")
    self.pitch = 0
    self.yaw = 0
    -- Hide and capture the mouse cursor
    window.set_mouse_lock(true)
    self.forward = vmath.vector3(0, 0, -1)
    self.right = vmath.vector3(1, 0, 0)
    self.up = vmath.vector3(0, 1, 0)
    self.velocity = vmath.vector3(0)
end

function update(self, dt)
    local pos = go.get_position()

    -- Apply movement velocity
    if vmath.length_sqr(self.velocity) > 0 then
        local move_dir = vmath.normalize(self.velocity)
        pos = pos + move_dir * self.move_speed * dt
        go.set_position(pos)
    end

    -- Calculate the view matrix
    local rotation = go.get_rotation()
    self.forward = vmath.rotate(rotation, vmath.vector3(0, 0, -1))
    local target = pos + self.forward
    local view_matrix = vmath.matrix4_look_at(pos, target, self.up)

    -- Send the calculated view matrix to the render pipeline
    msg.post("@render:", "set_view_matrix", { matrix = view_matrix })

    -- Reset velocity for the next frame
    self.velocity = vmath.vector3(0)
end

function on_input(self, action_id, action)
    -- Mouse movement is not a bound action: it arrives with a nil action_id
    if action_id == nil then
        self.yaw = self.yaw - action.dx * self.mouse_sensitivity
        self.pitch = self.pitch + action.dy * self.mouse_sensitivity
        -- Clamp pitch to prevent the camera from flipping over
        self.pitch = math.max(-89, math.min(89, self.pitch))
        -- Convert Euler angles to a quaternion
        local rot_y = vmath.quat_rotation_y(math.rad(self.yaw))
        local rot_x = vmath.quat_rotation_x(math.rad(self.pitch))
        go.set_rotation(rot_y * rot_x)
    elseif action_id == hash("move_forward") then
        self.velocity = self.velocity + self.forward
    elseif action_id == hash("move_backward") then
        self.velocity = self.velocity - self.forward
    elseif action_id == hash("move_left") then
        -- Normalize so strafing keeps full weight when looking up or down
        self.right = vmath.normalize(vmath.cross(self.forward, self.up))
        self.velocity = self.velocity - self.right
    elseif action_id == hash("move_right") then
        self.right = vmath.normalize(vmath.cross(self.forward, self.up))
        self.velocity = self.velocity + self.right
    end
end
This script captures mouse delta movements to adjust pitch and yaw, converting those Euler angles into a robust Quaternion rotation. It then derives the forward and right vectors directly from that rotation to ensure that pressing "W" always moves you in the direction you are currently looking.
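The relationship between the two angles and the movement vectors can also be written out by hand. The following plain-Lua sketch assumes the same convention as the script (yaw 0 and pitch 0 look down negative Z, yaw is applied around Y and pitch around X, world up is positive Y). The function names are hypothetical, and in the engine the quaternion route shown above does this work for you.

```lua
-- Convert yaw and pitch (in degrees) to a unit forward vector.
-- Assumed convention: yaw 0 / pitch 0 looks down -Z, positive yaw turns left,
-- positive pitch looks up.
local function forward_from_angles(yaw_deg, pitch_deg)
    local yaw, pitch = math.rad(yaw_deg), math.rad(pitch_deg)
    return {
        x = -math.sin(yaw) * math.cos(pitch),
        y = math.sin(pitch),
        z = -math.cos(yaw) * math.cos(pitch),
    }
end

-- Right vector = normalize(forward x world_up), with world_up = (0, 1, 0).
-- The cross product reduces to (-f.z, 0, f.x) for this choice of up.
local function right_from_forward(f)
    local rx, rz = -f.z, f.x
    local len = math.sqrt(rx * rx + rz * rz)
    return { x = rx / len, y = 0, z = rz / len }
end

local ahead = forward_from_angles(0, 0)
local strafe = right_from_forward(ahead)
local turned = forward_from_angles(90, 0)
print(ahead.z, strafe.x, turned.x)
```

The length used for normalization shrinks toward zero as the camera looks straight up or down, which is one more reason to keep the pitch clamped short of 90 degrees.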
3D Asset Integration and Custom Shaders
With Defold's recent updates, bringing 3D assets into your project is trivial. The engine natively supports the .gltf and .glb formats, which have become the industry standard for web and lightweight game development.
However, rendering a mesh requires a material, and materials require shaders. By default, Defold includes basic materials, but writing your own GLSL shaders gives you the visual distinctiveness necessary to stand out. Let's write a fast, unlit textured shader that is perfectly optimized for mobile or HTML5 targets.
The Vertex Shader (model.vp)
// model.vp
uniform highp mat4 view_proj;
uniform highp mat4 world;

attribute highp vec4 position;
attribute mediump vec2 texcoord0;

varying mediump vec2 var_texcoord0;

void main()
{
    // Calculate the final clip-space position of the vertex
    vec4 p = view_proj * world * vec4(position.xyz, 1.0);
    var_texcoord0 = texcoord0;
    gl_Position = p;
}
The Fragment Shader (model.fp)
// model.fp
varying mediump vec2 var_texcoord0;

uniform lowp sampler2D texture_sampler;
uniform lowp vec4 tint;

void main()
{
    // Sample the texture and multiply by a color tint
    lowp vec4 tex_color = texture2D(texture_sampler, var_texcoord0.xy);
    gl_FragColor = tex_color * tint;
}
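In your Defold material file, map the view_proj uniform to the engine's built-in view-projection matrix and the world uniform to the built-in world matrix, bind texture_sampler to your mesh's diffuse texture, and declare tint as a fragment constant with a default such as (1, 1, 1, 1) so the texture color passes through unchanged. Because these shaders do not calculate dynamic lighting or shadow maps, they are cheap to run, which makes it much easier to hold 60 FPS on low-end hardware.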
Handling 3D Multiplayer and State Synchronization
When you transition from a 2D grid-based game to a full 3D environment, your networking architecture becomes considerably more complex. State synchronization now has to account for a third positional axis, floating-point differences between physics simulations, and full quaternion rotations. If you handle this poorly, you will end up with severe visual stuttering, a common issue we explored when analyzing How to Fix Player Location Desync in UEFN and Unreal Engine Multiplayer.
Because Defold is a lightweight engine, it does not come with a heavy, built-in replication system like Unreal Engine's RPCs. You are responsible for packing your state data efficiently.
Relying on REST APIs to sync 3D positions will bottleneck your game loop instantly. Instead, you need persistent, bidirectional connections. Our previous guide, Ditch HTTP Polling: An Unreal Engine WebSockets Tutorial for Real-Time Backends, covered this for Unreal, and the same architectural principles apply directly to Defold's Lua-based WebSocket extensions.
Here is an example of how you should serialize 3D transform data to minimize payload size over WebSockets:
-- scripts/network_sync.lua
-- Defold exposes json as a built-in global module, so no require is needed.
local M = {}

-- Round to the nearest two decimal places (math.floor alone would bias
-- negative values downward)
local function round2(value)
    return math.floor(value * 100 + 0.5) / 100
end

function M.serialize_transform(go_id)
    local pos = go.get_position(go_id)
    local rot = go.get_rotation(go_id)

    -- Keep the payload small: send only the fields the receiver needs.
    -- Positions are rounded to 2 decimal places to save bandwidth.
    local payload = {
        id = hash_to_hex(go_id),
        x = round2(pos.x),
        y = round2(pos.y),
        z = round2(pos.z),
        -- Send all four quaternion components so the receiver can rebuild
        -- the rotation directly
        qx = rot.x,
        qy = rot.y,
        qz = rot.z,
        qw = rot.w
    }
    return json.encode(payload)
end

return M
By rounding the floats and sending only essential fields, you keep each update small, which matters when transforms are sent many times per second during rapid movement. Managing this flow is the key to responsive cross-platform play.
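If you want to go one step further than rounded floats, a common alternative is fixed-point quantization: send integer steps and let the receiver divide them back out. This is a different technique from the serializer above, sketched here in plain Lua under the assumption that two decimal places of precision are enough. The helper names and the comma-separated wire format are hypothetical choices for illustration.

```lua
local SCALE = 100 -- 100 steps per world unit, i.e. two decimal places

-- Quantize a coordinate to the nearest integer step.
local function quantize(value)
    return math.floor(value * SCALE + 0.5)
end

-- Recover an approximate coordinate on the receiving side.
local function dequantize(step)
    return step / SCALE
end

-- Pack three coordinates into a compact comma-separated string.
local function pack_position(x, y, z)
    return string.format("%d,%d,%d", quantize(x), quantize(y), quantize(z))
end

-- Parse the string back into three approximate coordinates.
local function unpack_position(packed)
    local sx, sy, sz = packed:match("^(-?%d+),(-?%d+),(-?%d+)$")
    return dequantize(tonumber(sx)), dequantize(tonumber(sy)), dequantize(tonumber(sz))
end

local packed = pack_position(12.3456, -0.004, 7)
local x, y, z = unpack_position(packed)
print(packed) -- 1235,0,700
```

The round trip loses at most half a step per axis (0.005 units at this scale), which is usually invisible once the receiver interpolates between updates.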
The Infrastructure Burden of Cross-Platform 3D
Writing a highly optimized 3D game client in Defold is immensely satisfying. The engine stays out of your way, compiles in seconds, and lets you focus strictly on logic.
However, the moment you decide to make that 3D game multiplayer, add cloud saves, or implement cross-platform leaderboards, your focus immediately shifts away from game development and into server orchestration. You suddenly find yourself writing Dockerfiles, configuring Kubernetes clusters, wrestling with Redis instances for session state, and trying to secure your WebSocket gateways against DDoS attacks.
Building this yourself requires setting up load balancers, database sharding, and SSL cert management — easily 4-6 weeks of work just to get a reliable prototype running.
With horizOn, these backend services come pre-configured, letting you ship your game instead of your infrastructure. horizOn provides native integrations for user authentication, real-time database syncing, and server-authoritative logic, perfectly bridging the gap for lightweight engines like Defold that do not ship with proprietary backend ecosystems. You get to maintain the speed of a tiny client engine while outsourcing the heavy lifting of the server architecture.
Best Practices for Defold 3D Development
If you are planning to build a commercially viable 3D project in Defold, adhere strictly to these architectural guidelines:
- Keep Geometry Heavily Optimized: Defold is designed for speed. To maintain its lightweight advantage, keep your level geometry under 100,000 polygons total per scene, especially if you are targeting HTML5/WebGL. Use baked normal maps rather than high-density meshes to simulate detail.
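- Leverage Render Predicates for Frustum Culling: Do not assume that every object outside the camera's view is skipped for you. Recent Defold versions accept a frustum matrix (typically projection * view) in the options table of render.draw, which lets the engine cull supported components drawn with that predicate. For anything this does not cover, write coarse culling logic in Lua that disables the model components of objects that are far out of view to save rasterization time.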
- Consolidate Draw Calls via Atlasing: Every unique material and texture requires a separate draw call sent to the GPU. Combine your textures into large texture atlases. If 10 different 3D models share the exact same material and atlas, Defold can batch them much more efficiently under the hood.
- Pre-calculate Complex Math: Matrix multiplications and quaternion conversions are highly expensive operations in Lua. Cache your forward and right vectors and only recalculate them when the player's rotation actually changes, rather than doing the heavy math unconditionally every single frame.
- Decouple Logic from Render Frequency: Your game logic (update) might run at 60 FPS, but your custom physics or networking steps might tick at 30 FPS. Interpolate your 3D visual positions based on velocity rather than snapping them directly to the latest state to ensure buttery smooth rendering on varying monitor refresh rates.
- Manage Lua Garbage Collection: In a dynamic 3D environment, you are frequently creating and destroying vector objects and matrices. Lua's garbage collector can cause noticeable frame spikes if left unmanaged. Reuse vmath.vector3 and vmath.matrix4 instances whenever possible by updating their internal values directly, instead of instantiating new local variables inside your update loop. Pre-allocate memory pools for bullets and entities.
- Bake Your Lighting Externally: Because dynamic lighting in custom GLSL shaders will quickly eat into your performance budget on mobile devices, bake your global illumination and ambient occlusion directly into your textures using Blender or Maya before exporting your glTF models. A simple unlit shader with beautifully baked lighting will always outperform a complex dynamic shader on mobile web browsers.
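The "Decouple Logic from Render Frequency" guideline can be sketched in plain Lua as a fixed-timestep accumulator. This version blends between the last two simulated states rather than extrapolating from velocity, which is one common way to get the same smoothing. The one-dimensional state table and the function names are assumptions made for illustration; in a Defold script the per-frame call would live in update(self, dt).

```lua
local TICK = 1 / 30 -- simulation step in seconds (30 Hz)

local state = { prev_x = 0, curr_x = 0, velocity = 3, accumulator = 0 }

-- Advance the simulation by exactly one fixed step.
local function simulate(s)
    s.prev_x = s.curr_x
    s.curr_x = s.curr_x + s.velocity * TICK
end

-- Call once per rendered frame with that frame's delta time.
-- Returns the position to draw, blended between the last two simulation states.
local function frame(s, dt)
    s.accumulator = s.accumulator + dt
    while s.accumulator >= TICK do
        simulate(s)
        s.accumulator = s.accumulator - TICK
    end
    local alpha = s.accumulator / TICK
    return s.prev_x + (s.curr_x - s.prev_x) * alpha
end

-- Render at 60 Hz while the simulation ticks at 30 Hz.
local drawn = {}
for i = 1, 4 do
    drawn[i] = frame(state, 1 / 60)
end
print(drawn[3], drawn[4])
```

The drawn position trails the simulation by up to one tick, which is the usual price of interpolation: movement stays smooth at any refresh rate in exchange for a small, constant visual delay.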
Conclusion
The evolution of Defold into a robust 3D game engine is a massive win for independent developers. It successfully retains its lightning-fast compile times and incredibly tiny binary footprints while offering the raw mathematical foundations and tooling required to build expansive, engaging 3D worlds. By mastering custom render scripts, understanding matrix operations, and efficiently serializing your network data, you can build cross-platform titles that compete technically with much larger, bloat-heavy engines.
When you are ready to take your highly optimized 3D client online and scale your multiplayer backend without the headache of managing raw infrastructure, try horizOn for free or check out the API docs to see how quickly you can integrate real-time services into your next Defold project.