Performance Evaluation of SNPs Machine-Learning Workload on Intel® Pentium® Hyper-Threading Architectures

S. Ge, J. Song, C. Lai, E. Li, W. Hu (PRC), and X. Tian (USA)

Keywords

Thread-level parallelism, OpenMP, hyper-threading, machine learning, performance evaluation, optimization.

Abstract

This paper analyzes two processors, a Pentium 4 hyper-threading processor and a Pentium 4 hyper-threading processor built on 90nm technology, running a machine learning workload parallelized with OpenMP* and the Intel compiler. The focus is to understand SNPs performance and the underlying reasons behind it. Particular attention is paid to micro-architecture metrics, which are compared to examine and evaluate, where appropriate, how the two types of processors perform relative to expectation on SNP machine learning workloads. Results include parallel speedup and a comparison of micro-architecture metrics.
