Efficient Mojo Tensor To Numpy Array Conversion

Hello, I am trying to convert a Mojo Tensor to a NumPy array without an element-by-element loop, or at least with as little per-element work as possible. My approach is to take the pointer to the Tensor's data and use Python's ctypes from_address() on it. But it doesn't work: the resulting np_array is not the same as the tensor. Maybe Tensor has some header bytes? Or the pointer does not point to a contiguous memory block? I would like to understand what is wrong with this code and get some input on how to achieve fast conversions for potentially large Mojo Tensors. Here is the code:
from tensor import Tensor, TensorShape
from python import Python

def main():
    var ctypes = Python.import_module("ctypes")
    var np = Python.import_module("numpy")

    # 4x4 identity matrix stored as int32
    var mojo_tensor = Tensor[DType.int32](
        TensorShape(4, 4),
        1, 0, 0, 0,
        0, 1, 0, 0,
        0, 0, 1, 0,
        0, 0, 0, 1,
    )

    # Raw address of the tensor's data buffer, as an integer
    var mojo_tensor_ptr = mojo_tensor.data().__int__()

    var num_elements = mojo_tensor.num_elements().__int__()
    var tensor_type = mojo_tensor.type

    # Load the hopefully contiguous tensor into a ctypes array
    var int32_array = (ctypes.c_int32 * num_elements).from_address(mojo_tensor_ptr.__int__())
    var np_array = np.ctypeslib.as_array(int32_array).astype(np.int32)

    print("Numpy array: ", np_array)
Thank you.
1 Reply
Sagi (3mo ago)
Hi, this code worked for me, although it has a problem: for some reason the first two elements read through the pointer are wrong (at least in my very limited tests). You can confirm that by printing the pointer values in Mojo. Apart from that, I just cast the pointer to a NumPy array with ctypes and numpy:
from tensor import Tensor, TensorShape
from python import Python

fn main() raises:
    var np = Python.import_module("numpy")
    var ctypes = Python.import_module("ctypes")
    var mojo_tensor = Tensor[DType.int32](
        TensorShape(5, 5),
        1, 0, 0, 0, 0,
        0, 1, 0, 0, 0,
        0, 0, 1, 0, 0,
        0, 0, 0, 1, 0,
        0, 0, 0, 0, 1
    )
    var ptr = mojo_tensor.data().__int__()
    var data_pointer = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_int32))
    var numpy_array = np.ctypeslib.as_array(data_pointer, shape=(5, 5))
    print(numpy_array)
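
One possible explanation for the wrong leading elements (a guess, not something confirmed in this thread): Mojo destroys a value as soon as it is no longer used, so the tensor may already be freed by the time NumPy reads through the raw pointer. Both from_address() and np.ctypeslib.as_array() build views over existing memory. They do not copy the data and they do not keep the owner alive. The pure-Python sketch below only illustrates that view behaviour with the standard ctypes module. The name `owner` is a hypothetical stand-in for the buffer owned by the Mojo Tensor, and nothing here calls into Mojo.

```python
import ctypes

# `owner` is a hypothetical stand-in for the buffer a Mojo Tensor owns.
IDENTITY_4X4 = [1 if r == c else 0 for r in range(4) for c in range(4)]
Int32x16 = ctypes.c_int32 * 16
owner = Int32x16(*IDENTITY_4X4)

# Take the raw address, similar to mojo_tensor.data().__int__() in the thread.
address = ctypes.addressof(owner)

# from_address builds a view over existing memory: no loop, no copy,
# and no reference that would keep `owner` alive.
view = Int32x16.from_address(address)

# A write through the owner is visible through the view (shared memory).
owner[0] = 42
shared = view[0]

# An explicit copy is independent of the owner from this point on.
snapshot = list(view)
owner[0] = 7

print(shared, view[0], snapshot[0])  # prints: 42 7 42
```

If the lifetime guess is right, two things might help on the Mojo side: keep the tensor in use until after the last NumPy read (for example with a trailing `_ = mojo_tensor`), or call `.copy()` on the NumPy view while the tensor is still alive. Both are untested suggestions for this particular code.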
