Skip to content

Public API

User facing functions supported by cuNumeric.jl

cuNumeric.versioninfo Method
julia
versioninfo()

Prints the cuNumeric build configuration summary, including package metadata, Julia and compiler version, and paths to core dependencies.

source
cuNumeric.disable_gc! Method
julia
disable_gc!()

Disables the automatic garbage collection heuristics. This gives the user full control over memory management.

source
cuNumeric.init_gc! Method
julia
init_gc!()

Initializes the cuNumeric garbage collector by querying the available device memory and enabling the automatic GC heuristics.

source
cuNumeric.@cunumeric Macro
julia
@cunumeric expr

Wraps a block of code so that all temporary NDArray allocations (e.g. from slicing or function calls) are tracked and safely freed at the end of the block. Ensures proper cleanup of GPU memory by inserting maybe_insert_delete calls automatically.

source
cuNumeric.allowpromotion Function
julia
allowpromotion([true])
allowpromotion([true]) do
    ...
end

Use this function to allow or disallow promotion to double precision, either globally or for the duration of the do block.

See also: @allowpromotion.

source
cuNumeric.allowscalar Function
julia
allowscalar([true])
allowscalar([true]) do
    ...
end

Use this function to allow or disallow scalar indexing, either globall or for the duration of the do block.

See also: @allowscalar.

source
cuNumeric.assertpromotion Method
julia
assertpromotion(op)

Assert that a certain operation op performs promotion to a wider type. If this is not allowed, an error will be thrown (assertpromotion).

source
cuNumeric.assertscalar Method
julia
assertscalar(op::String)

Assert that a certain operation op performs scalar indexing. If this is not allowed, an error will be thrown (allowscalar).

source
cuNumeric.@allowpromotion Macro
julia
@allowpromotion() begin
    # code that can use scalar indexing
end

Denote which operations can use scalar indexing.

See also: allowpromotion.

source
cuNumeric.@allowscalar Macro
julia
@allowscalar() begin
    # code that can use scalar indexing
end

Denote which operations can use scalar indexing.

See also: allowscalar.

source
cuNumeric.get_time_microseconds Method

Returns the timestamp in microseconds. Blocks on all Legate operations preceding the call to this function.

source
cuNumeric.get_time_nanoseconds Method

Returns the timestamp in nanoseconds. Blocks on all Legate operations preceding the call to this function.

source
Base.:== Method
julia
==(arr::NDArray, julia_arr::Array)
==(julia_arr::Array, arr::NDArray)

Compare an NDArray and a Julia Array for element-wise equality.

Returns true if both arrays have the same shape and all corresponding elements are equal. Returns false otherwise (including if sizes differ, with a warning).

Warning

This function uses scalar indexing and should not be used in production code. This is meant for testing.

Examples

julia
arr = cuNumeric.ones(2, 2)
julia_arr = ones(2, 2)
arr == julia_arr
julia_arr == arr
julia_arr2 = zeros(2, 2)
arr == julia_arr2
source
Base.:== Method
julia
==(arr1::NDArray, arr2::NDArray)

Check if two NDArrays are equal element-wise.

Returns true if both arrays have the same shape and all corresponding elements are equal. Currently supports arrays up to 3 dimensions. For higher dimensions, returns false with a warning.

Warning

This function uses scalar indexing and should not be used in production code. This is meant for testing.

Examples

julia
a = cuNumeric.ones(2, 2)
b = cuNumeric.ones(2, 2)
a == b
c = cuNumeric.zeros(2, 2)
a == c
source
Base.copy Method
julia
Base.copy(arr::NDArray)

Create and return a deep copy of the given NDArray.

Examples

julia
a = cuNumeric.ones(2, 2)
b = copy(a)
b === a
b[1,1] == a[1,1]
source
Base.copyto! Method
julia
copyto!(arr::NDArray, other::NDArray)

Assign the contents of other to arr element-wise.

This function overwrites the data in arr with the values from other. Both arrays must have the same shape.

Examples

julia
a = cuNumeric.zeros(2, 2)
b = cuNumeric.ones(2, 2)
copyto!(a, b);
a[1,1]
source
Base.eltype Method
julia
Base.eltype(arr::NDArray)

Returns the element type of the NDArray.

source
Base.firstindex Method
julia
Base.firstindex(arr::NDArray, dim::Int)
Base.lastindex(arr::NDArray, dim::Int)
Base.lastindex(arr::NDArray)

Provide the first and last valid indices along a given dimension dim for NDArray.

  • firstindex always returns 1, since Julia arrays are 1-indexed.

  • lastindex returns the size of the array along the specified dimension.

  • lastindex(arr) returns the size along the first dimension.

Examples

julia
arr = cuNumeric.rand(4, 5);
firstindex(arr, 2)
lastindex(arr, 2)
lastindex(arr)
source
Base.isapprox Method
julia
isapprox(arr1::NDArray, arr2::NDArray; atol=0, rtol=0)
isapprox(arr::NDArray, julia_array::AbstractArray; atol=0, rtol=0)
isapprox(julia_array::AbstractArray, arr::NDArray; atol=0, rtol=0)

Approximate equality comparison between two NDArrays or between an NDArray and a Julia AbstractArray.

Returns true if the arrays have the same shape and all corresponding elements are approximately equal within the given absolute tolerance atol and relative tolerance rtol.

The second and third methods handle comparisons between NDArray and Julia arrays by forwarding to a common comparison function.

Warning

This function uses scalar indexing and should not be used in production code. This is meant for testing.

Examples

julia
arr1 = cuNumeric.ones(2, 2)
arr2 = cuNumeric.ones(2, 2)
julia_arr = ones(2, 2)
isapprox(arr1, arr2)
isapprox(arr1, julia_arr)
isapprox(julia_arr, arr2)
source
Base.size Method
julia
Base.size(arr::NDArray)
Base.size(arr::NDArray, dim::Int)

Return the size of the given NDArray.

  • Base.size(arr) returns a tuple of dimensions of the array.

  • Base.size(arr, dim) returns the size of the array along the specified dimension dim.

These override Base's size methods for the NDArray type, using the underlying cuNumeric API to query array shape.

Examples

julia
arr = cuNumeric.rand(3, 4, 5);
size(arr)
size(arr, 2)
source
Random.rand! Method
julia
cuNumeric.rand!(arr::NDArray)

Fills arr with AbstractFloats uniformly at random.

cuNumeric.rand(NDArray, dims::Int...)
cuNumeric.rand(NDArray, dims::Tuple)

Create a new NDArray of element type Float64, filled with uniform random values.

This function uses the same signature as Base.rand with a custom backend, and currently supports only Float64 with uniform distribution (code = 0). In order to support other Floats, we type convert for the user automatically.

Examples

julia
cuNumeric.rand(NDArray, 2, 2)
cuNumeric.rand(NDArray, (4, 1))
A = cuNumeric.zeros(2, 2); cuNumeric.rand!(A)
source
cuNumeric.as_type Method
julia
as_type(arr::NDArray, t::Type{T}) where {T}

Convert the element type of arr to type T, returning a new NDArray with elements cast to T.

Arguments

  • arr::NDArray: Input array.

  • t::Type{T}: Target element type.

Returns

A new NDArray with the same shape as arr but with elements of type T.

Examples

julia
arr = cuNumeric.rand(4, 5);
as_type(arr, Float32)
source
cuNumeric.falses Method
julia
cuNumeric.falses(dims::Tuple, val)
cuNumeric.falses(dim::Int, val)
cuNumeric.falses(dims::Int...)

Create an NDArray filled with the false, with the shape specified by dims.

Examples

julia
cuNumeric.falses(2, 3)
source
cuNumeric.full Method
julia
cuNumeric.full(dims::Tuple, val)
cuNumeric.full(dim::Int, val)

Create an NDArray filled with the scalar value val, with the shape specified by dims.

Examples

julia
cuNumeric.full((2, 3), 7.5)
cuNumeric.full(4, 0)
source
cuNumeric.ones Method
julia
cuNumeric.ones([T=Float32,] dims::Int...)
cuNumeric.ones([T=Float32,] dims::Tuple)

Create an NDArray with element type T, of all zeros with size specified by dims. This function has the same signature as Base.ones, so be sure to call it as cuNuermic.ones.

Examples

julia
cuNumeric.ones(2, 2)
cuNumeric.ones(Float32, 3)
cuNumeric.ones(Int32, (2, 3))
source
cuNumeric.trues Method
julia
cuNumeric.trues(dims::Tuple, val)
cuNumeric.trues(dim::Int, val)
cuNumeric.trues(dims::Int...)

Create an NDArray filled with the true, with the shape specified by dims.

Examples

julia
cuNumeric.trues(2, 3)
source
cuNumeric.zeros Method
julia
cuNumeric.zeros([T=Float32,] dims::Int...)
cuNumeric.zeros([T=Float32,] dims::Tuple)

Create an NDArray with element type T, of all zeros with size specified by dims. This function mirrors the signature of Base.zeros, and defaults to Float32 when the type is omitted.

Examples

julia
cuNumeric.zeros(2, 2)
cuNumeric.zeros(Float64, 3)
cuNumeric.zeros(Int32, (2,3))
source
cuNumeric.floaty_unary_ops_no_args Constant

Supported Unary Operations

The following unary operations are supported and can be broadcast over NDArray:

  • - (negation)

  • ! (logical not)

  • abs

  • acos

  • acosh

  • asin

  • asinh

  • atan

  • atanh

  • cbrt

  • cos

  • cosh

  • deg2rad

  • exp

  • exp2

  • expm1

  • floor

  • isfinite

  • log

  • log10

  • log1p

  • log2

  • rad2deg

  • sign

  • signbit

  • sin

  • sinh

  • sqrt

  • tan

  • tanh

  • ^2

  • ^-1 or inv

Differences

  • The acosh function in Julia will error on inputs outside of the domain (x >= 1) but cuNumeric.jl will return NaN.

Examples

julia
A = cuNumeric.ones(Float32, 3, 3)

abs.(A)
log.(A .+ 1)
-sqrt.(abs.(A))
source
cuNumeric.unary_reduction_map Constant

Supported Unary Reduction Operations

The following unary reduction operations are supported and can be applied directly to NDArray values:

allanymaximumminimumprodsum

These operations follow standard Julia semantics.

Examples

julia
A = cuNumeric.ones(5)

maximum(A)
sum(A)
source
LinearAlgebra.mul! Method
julia
LinearAlgebra.mul!(out::NDArray, arr1::NDArray, arr2::NDArray)

Compute the matrix multiplication of arr1 and arr2, storing the result in out.

This function performs the operation in-place, modifying out.

Examples

julia
a = cuNumeric.ones(2, 3)
b = cuNumeric.ones(3, 2)
out = cuNumeric.zeros(2, 2)
LinearAlgebra.mul!(out, a, b)
source

CUDA.jl Tasking

This section will detail how to use custom CUDA.jl kernels with the Legate runtime.

cuNumeric.@cuda_task Macro
julia
@cuda_task(f(args...))

Compile a Julia GPU kernel to PTX, register it with the Legate runtime, and return a CUDATask object for later launch.

Arguments

  • f — The name of the Julia CUDA.jl GPU kernel function to compile.

  • args... — Example arguments to the kernel, used to determine the argument type signature when generating PTX.

Description

This macro automates the process of:

  1. Inferring the CUDA argument types for the given args using map_ndarray_cuda_types.

  2. Using CUDA.code_ptx to compile the specified GPU kernel (f) into raw PTX text for the inferred types.

  3. Extracting the kernel's function symbol name from the PTX using extract_kernel_name.

  4. Registering the compiled PTX and kernel name with the Legate runtime via ptx_task, making it available for GPU execution.

  5. Returning a CUDATask struct that stores the kernel name and type signature, which can be used to configure and launch the kernel later.

Notes

  • The args... are not executed; they are used solely for type inference.

  • This macro is intended for use with the Legate runtime and assumes a CUDA context is available.

  • Make sure your kernel code is GPU-compatible and does not rely on unsupported Julia features.

Example

julia
mytask = @cuda_task my_kernel(A, B, C)
source
cuNumeric.@launch Macro
julia
@launch(; task, blocks=(1,), threads=(256,), inputs=(), outputs=(), scalars=())

Launch a GPU kernel (previously registered via @cuda_task) through the Legate runtime.

Keywords

  • task — A CUDATask object, typically returned by @cuda_task.

  • blocks — Tuple or single element specifying the CUDA grid dimensions. Defaults to (1,).

  • threads — Tuple or single element specifying the CUDA block dimensions. Defaults to (256,).

  • inputs — Tuple or single element of input NDArray objects.

  • outputs — Tuple or single element of output NDArray objects.

  • scalars — Tuple or single element of scalar values.

Description

The @launch macro validates the provided keywords, ensuring only the allowed set (:task, :blocks, :threads, :inputs, :outputs, :scalars) are present. It then expands to a call to cuNumeric.launch, passing the given arguments to the Legate runtime for execution.

This macro is meant to provide a concise, declarative syntax for launching GPU kernels, separating kernel compilation (via @cuda_task) from execution configuration.

Notes

  • task must be a kernel registered with the runtime, usually from @cuda_task.

  • All keyword arguments must be specified as assignments, e.g. blocks=(2,2) not positional arguments.

  • Defaults are chosen for single-block, 256-thread 1D launches.

  • The macro escapes its body so that the values of inputs/outputs/scalars are captured from the surrounding scope at macro expansion time.

Example

julia
mytask = @cuda_task my_kernel(A, B, C)

@launch task=mytask blocks=(8,8) threads=(32,32) inputs=(A, B) outputs=(C)
source

CNPreferences

This section details how to set custom build configuration options. To see more details visit our install guide here.

CNPreferences.check_unchanged Method
julia
CNPreferences.check_unchanged()

Throws an error if the preferences have been modified in the current Julia session, or if they are modified after this function is called.

This is should be called from the __init__() function of any package which relies on the values of CNPreferences.

source
CNPreferences.use_conda Method
julia
CNPreferences.use_conda(conda_env::String; export_prefs = false, force = true)

Tells cuNumeric.jl to use existing conda install. We make no gurantees of compiler compatability at this time.

Expects conda_env to be the absolute path to the root of the environment. For example, /home/julialegate/.conda/envs/cunumeric-gpu

source
CNPreferences.use_developer_mode Method
julia
CNPreferences.use_developer_mode(; wrapper_branch="main", use_cupynumeric_jll=true, cupynumeric_path=nothing, export_prefs = false, force = true)

Tells cuNumeric.jl to enable developer mode. This will clone cunumeric_jl_wrapper into cuNumeric.jl/deps.

To specify a cunumeric_jl_wrapper branch: wrapper_branch="some-branch" To disable using cupynumeric_jll: use_cupynumeric_jll=false If you disable cupynumeric_jll, then you need to set a path to cuPyNumeric with cupynumeric_path="/path/to/cupynumeric"

source
CNPreferences.use_jll_binary Method
julia
CNPreferences.use_jll_binary(; export_prefs = false, force = true)

Tells Legate.jl to use JLLs. This is the default option.

source

Internal API

cuNumeric.NDArray Type

Internal API

The NDArray type represents a multi-dimensional array in cuNumeric. It is a wrapper around a Legate array and provides various methods for array manipulation and operations. Finalizer calls nda_destroy_array to clean up the underlying Legate array when the NDArray is garbage collected.

source
cuNumeric.LegateType Method
julia
LegateType(T::Type)

Internal API

Converts a Julia type T to the corresponding Legate type.

source
cuNumeric.compare Method
julia
compare(x, y, max_diff)

Internal API

Compare two arrays x and y for approximate equality within a maximum difference max_diff.

Supports comparisons between:

  • an NDArray and a Julia AbstractArray

  • two NDArrays

  • a Julia AbstractArray and an NDArray

Returns true if the arrays have the same shape and element type (for mixed types), and all corresponding elements differ by no more than max_diff.

Emits warnings when array sizes or element types differ.

Warning

This function uses scalar indexing and should not be used in production code. This is meant for testing.

Notes

  • This is an internal API used by higher-level approximate equality functions.

  • Does not support relative tolerance (rtol).

Behavior

  • Checks size compatibility.

  • Checks element type compatibility for NDArray vs Julia array.

  • Iterates over elements using CartesianIndices to compare element-wise difference.

source
cuNumeric.padded_shape Method
julia
padded_shape(arr::NDArray)

Internal API

Return the size of the given NDArray. This will include the padded size.

source
cuNumeric.shape Method
julia
shape(arr::NDArray)

Internal API

Return the size of the given NDArray.

source
cuNumeric.slice_array Method
julia
slice_array(slices::Vararg{Tuple{Union{Int,Nothing},Union{Int,Nothing}},N}) where {N}

Internal API

Constructs a vector of cuNumeric.Slice objects from a variable number of (start, stop) tuples.

Each tuple corresponds to a dimension slice, using slice internally.

source
cuNumeric.to_cpp_index Method
julia
to_cpp_index(d::Int64, ::Type{T}=UInt64)

Internal API

Converts a single Julia 1-based index d to a zero-based C++ style index wrapped in StdVector.

source
cuNumeric.to_cpp_index Method
julia
to_cpp_index(idx::Dims{N}, ::Type{T}=UInt64) where {N}

Internal API

Converts a Julia 1-based index tuple idx to a zero-based C++ style index wrapped in StdVector of the specified integer type.

Each element of idx is decremented by 1 to adjust from Julia’s 1-based indexing to C++ 0-based indexing.

source