Abstract: Computer vision multimedia pipelines have become both more sophisticated and more robust over the years. These pipelines can accept multiple inputs, perform frame analysis, and produce outputs on a variety of platforms with near-real-time performance. Vendors such as Nvidia have significantly expanded their framework and library offerings while providing documentation and tutorials through online training. Despite this rapid growth, many of the libraries, frameworks, and tutorials come with noticeable limitations. These limitations are especially apparent in high-performance computing (HPC) environments, where graphics processing units (GPUs) may be older, user-level rights are more restricted, and access to a graphical user interface is not always available. This work describes the process of building multimedia object detection and segmentation pipelines within the HPC environment, the challenges encountered, and ways to overcome them. It presents an iterative design process that can serve as a blueprint for future development of similar computer vision pipelines in HPC hosting environments.