vllm.model_executor.layers.quantization.utils.marlin_utils_test
Utility functions used for tests and benchmarks.
MarlinWorkspace
Source code in vllm/model_executor/layers/quantization/utils/marlin_utils_test.py
__init__
awq_marlin_quantize
awq_marlin_quantize(
    w: Tensor,
    quant_type: ScalarType,
    group_size: int,
    input_dtype: dtype | None = None,
)
get_weight_perm
marlin_permute_weights
marlin_permute_weights(
    q_w,
    size_k,
    size_n,
    perm,
    tile=GPTQ_MARLIN_TILE,
    is_a_8bit=False,
)
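The parameters above (`size_k`, `size_n`, `perm`, `tile`) suggest a tile-based reordering of a `size_k x size_n` weight matrix. The following is a minimal pure-Python sketch of that general idea, assuming the matrix is first regrouped into `tile x tile` blocks and a fixed index permutation is then applied to each chunk of a row. The helpers `regroup_tiles` and `apply_perm` are hypothetical names used for illustration only; they are not part of vLLM and do not model `is_a_8bit`.

```python
def regroup_tiles(w, size_k, size_n, tile):
    """Lay out each tile x tile block contiguously (hypothetical helper).

    `w` is a list of `size_k` rows with `size_n` entries each. The result has
    `size_k // tile` rows of `size_n * tile` entries.
    """
    if size_k % tile != 0 or size_n % tile != 0:
        raise ValueError("size_k and size_n must be multiples of tile")
    out = []
    for kb in range(size_k // tile):          # block row
        row = []
        for nb in range(size_n // tile):      # block column
            for r in range(tile):             # row inside the block
                for c in range(tile):         # column inside the block
                    row.append(w[kb * tile + r][nb * tile + c])
        out.append(row)
    return out


def apply_perm(rows, perm):
    """Reorder every len(perm)-sized chunk of each row by `perm` (hypothetical helper)."""
    chunk_len = len(perm)
    result = []
    for row in rows:
        if len(row) % chunk_len != 0:
            raise ValueError("row length must be a multiple of len(perm)")
        new_row = []
        for start in range(0, len(row), chunk_len):
            chunk = row[start:start + chunk_len]
            new_row.extend(chunk[i] for i in perm)
        result.append(new_row)
    return result


# Tiny example: a 2 x 4 matrix, 2 x 2 tiles, and an assumed toy permutation.
w = [[0, 1, 2, 3],
     [4, 5, 6, 7]]
tiled = regroup_tiles(w, size_k=2, size_n=4, tile=2)
permuted = apply_perm(tiled, perm=[1, 0, 3, 2])
```

Here `tiled` holds the two 2 x 2 blocks back to back in a single row, and `permuted` swaps neighbouring entries inside each chunk of four. A real Marlin permutation is much larger and is chosen to match the GPU kernel's memory layout.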
marlin_quantize
marlin_quantize(
    w: Tensor,
    quant_type: ScalarType,
    group_size: int,
    act_order: bool,
    test_perm: Tensor | None = None,
    input_dtype: dtype | None = None,
)
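To illustrate what a `group_size` parameter typically means in group-wise weight quantization, here is a minimal pure-Python sketch in which every run of `group_size` consecutive values shares one scale. It assumes symmetric signed quantization with `num_bits` bits; the functions `quantize_groupwise` and `dequantize_groupwise` are hypothetical and are not vLLM's implementation, which works on tensors and is driven by `quant_type`.

```python
def quantize_groupwise(column, group_size, num_bits=4):
    """Quantize one weight column; each group of `group_size` values shares a scale.

    Assumption: symmetric signed quantization to the range [-max_q, max_q].
    """
    if len(column) % group_size != 0:
        raise ValueError("column length must be a multiple of group_size")
    max_q = 2 ** (num_bits - 1) - 1  # 7 for 4 bits
    q_values, scales = [], []
    for start in range(0, len(column), group_size):
        group = column[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Avoid dividing by zero for an all-zero group.
        scale = max_abs / max_q if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            q = round(v / scale)
            q_values.append(max(-max_q, min(max_q, q)))  # clamp to the valid range
    return q_values, scales


def dequantize_groupwise(q_values, scales, group_size):
    """Map quantized integers back to floats using their group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(q_values)]


# Two groups of four values, each with its own scale.
column = [0.7, -0.3, 0.1, 0.0, 7.0, -2.0, 1.0, 0.0]
q_values, scales = quantize_groupwise(column, group_size=4)
restored = dequantize_groupwise(q_values, scales, group_size=4)
```

A smaller `group_size` gives each scale fewer values to cover, which usually lowers the reconstruction error at the cost of storing more scales.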