Frank Barchard 11dd1b956f ARGBToAR30 use vpmulhuw to replicate fields
AR30 is optimized with 3 techniques:
1. vpmulhuw is used to replicate 8 bits to 10 bits.
2. Two channels are processed at a time: R and B together, and A and G together.
3. vpshufb is used to shift and mask the 2 channels of R and B.

Red Blue
With the 8 bit value in the upper bits, vpmulhuw by (1024+4) will produce a 10
bit value in the low 10 bits of each 16 bit value. This is what is wanted for
the blue channel. The red needs to be shifted 4 left, so multiply by
(1024+4)*16 for red.
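
The arithmetic above can be checked with a scalar model of a single 16 bit
lane. This is only an illustrative sketch, not the SIMD code from row_gcc.cc:
the helper names MulHiU16, Blue10 and Red10Shifted are hypothetical, and the
0x3ff0 mask on red is an assumption of this sketch, used to drop the bits
below the 10 bit field.

```cpp
#include <cstdint>

// Scalar model of one 16 bit lane of vpmulhuw: the high 16 bits of an
// unsigned 16x16 multiply. (Hypothetical helper, for illustration only.)
constexpr uint16_t MulHiU16(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Blue: the 8 bit value sits in the upper byte of the lane. Multiplying by
// (1024+4) and keeping the high half yields (b << 2) | (b >> 6), i.e. the
// top 2 bits are replicated into the bottom of the 10 bit result.
constexpr uint16_t Blue10(uint8_t b) {
  return MulHiU16(static_cast<uint16_t>(b << 8), 1024 + 4);
}

// Red: the same replication, but the multiplier is scaled by 16 so the
// 10 bit value lands in bits 4..13 of the lane. The mask (an assumption of
// this sketch) clears the leftover bits 0..3.
constexpr uint16_t Red10Shifted(uint8_t r) {
  return MulHiU16(static_cast<uint16_t>(r << 8), (1024 + 4) * 16) & 0x3ff0;
}

// Compile-time spot checks of the replication.
static_assert(Blue10(255) == 1023, "full scale 8 bit maps to full scale 10 bit");
static_assert(Red10Shifted(255) == 0x3ff0, "red is the same value, 4 bits up");
```

Because 4*b has its low 2 bits clear and b >> 6 is at most 3, the product's
high half is exactly the bit-replicated value, with no rounding error.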

Alpha Green
Alpha and Green are already in the high bits of their 16 bit values, so vpand
can zero out the other bits, keeping just the 2 upper bits of alpha and the
8 bit green. Green uses the same multiplier as blue, (1024+4), which puts the
10 bit green in the low 10 bits. Alpha only needs a multiplier that shifts it
into position, leaving a gap of 10 bits above the green. Green is 10 bits, so
6 bits remain free in the low short. 4 more are needed, so alpha must sit 4
bits up in the upper short: a multiplier of 4 brings the 2 alpha bits to the
bottom of the upper 16 bits, and a further shift of 4 is a multiply by 16, so
the multiplier is (4*16) = 64. Then shift the result left 10 to position the
A and G channels.
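
The alpha and green steps can be traced the same way on one 32 bit pixel.
Again this is a scalar sketch under stated assumptions rather than the
vectorized code: MulHiU16 and AlphaGreenAR30 are hypothetical names, the
input is assumed to be a little-endian ARGB word (A in bits 24..31, G in
bits 8..15), and the mask constant 0xc000ff00 is derived here from the
description of what vpand keeps.

```cpp
#include <cstdint>

// Scalar model of one 16 bit lane of vpmulhuw: the high 16 bits of an
// unsigned 16x16 multiply. (Hypothetical helper, for illustration only.)
constexpr uint16_t MulHiU16(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Returns the A and G fields of an AR30 pixel (A in bits 30..31, G in bits
// 10..19); the R and B fields are left as zero.
inline uint32_t AlphaGreenAR30(uint32_t argb) {
  // Keep the top 2 bits of alpha and the 8 bit green; zero everything else.
  const uint32_t ag = argb & 0xc000ff00u;
  const uint16_t lo = static_cast<uint16_t>(ag & 0xffffu);  // green << 8
  const uint16_t hi = static_cast<uint16_t>(ag >> 16);      // alpha2 << 14

  // Green: replicate 8 bits to 10 bits in the low bits of the low short.
  const uint32_t g10 = MulHiU16(lo, 1024 + 4);
  // Alpha: multiplier 4*16 = 64 leaves the 2 bits at bits 4..5 of the
  // upper short, i.e. 10 bits above the top of the 10 bit green.
  const uint32_t a2 = MulHiU16(hi, 4 * 16);

  // Recombine the two shorts, then shift left 10 to reach the final
  // AR30 positions.
  return ((a2 << 16) | g10) << 10;
}
```

For example, an opaque pixel with full green, 0xff00ff00, maps to 0xc00ffc00:
alpha 3 in the top 2 bits and green 1023 in bits 10..19.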

Bug: libyuv:751
Test: ARGBToAR30_Opt
Change-Id: Ie4f20dce18203bae7b75acb1fd5232db8a8a4f11
Reviewed-on: https://chromium-review.googlesource.com/820046
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-12-12 02:57:54 +00:00
..
compare_common.cc Fix odd length HammingDistance 2017-10-04 22:21:36 +00:00
compare_gcc.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
compare_msa.cc Remove DISABLE_CLANG_MSA 2017-11-08 19:55:14 +00:00
compare_neon64.cc fix clang-format-ing for row arm functions 2017-09-14 21:35:06 +00:00
compare_neon.cc fix clang-format-ing for row arm functions 2017-09-14 21:35:06 +00:00
compare_win.cc HammingDistance_X86 using popcnt assembly 2017-10-23 21:15:12 +00:00
compare.cc HammingDistance_X86 using popcnt assembly 2017-10-23 21:15:12 +00:00
convert_argb.cc Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 2017-12-09 00:11:20 +00:00
convert_from_argb.cc Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 2017-12-09 00:11:20 +00:00
convert_from.cc H420ToRAW and H420ToRGB24 added for bt.709 support. 2017-11-17 01:20:05 +00:00
convert_jpeg.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
convert_to_argb.cc add Intel Code Analyst markers 2017-01-13 15:50:24 -08:00
convert_to_i420.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
convert.cc lint cleanup for convert RGB24ToI420 2017-03-09 10:32:23 +00:00
cpu_id.cc casting for c89 compatibility and lint cleanup 2017-11-09 18:22:17 +00:00
mjpeg_decoder.cc Revert "include <new> header for benefit of new clang builds" 2017-08-03 22:03:47 +00:00
mjpeg_validate.cc casting for c89 compatibility and lint cleanup 2017-11-09 18:22:17 +00:00
planar_functions.cc SplitRGBPlane and MergeRGBPlane functions added 2017-09-11 21:02:04 +00:00
rotate_any.cc Add MSA optimized rotate functions (used 16x16 transpose) 2017-01-13 15:50:02 +05:30
rotate_argb.cc clang-format 5.0 applied to libyuv 2017-03-08 18:50:12 +00:00
rotate_common.cc clang-format 5.0 applied to libyuv 2017-03-08 18:50:12 +00:00
rotate_dspr2.cc clang-format 5.0 applied to libyuv 2017-03-08 18:50:12 +00:00
rotate_gcc.cc clang-format 5.0 applied to libyuv 2017-03-08 18:50:12 +00:00
rotate_msa.cc Add MSA optimized rotate functions (used 16x16 transpose) 2017-01-13 15:50:02 +05:30
rotate_neon64.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
rotate_neon.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
rotate_win.cc mingw fix ifdefs to use gcc source 2017-10-17 17:36:35 +00:00
rotate.cc Add MSA optimized SplitUV, Set, MirrorUV, SobelX and SobelY row functions. 2017-08-17 18:39:22 +00:00
row_any.cc Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 2017-12-09 00:11:20 +00:00
row_common.cc H010ToAR30 for 10 bit bt.709 YUV to 30 bit RGB 2017-11-22 23:58:30 +00:00
row_dspr2.cc mips switch sgtu to sltu for clang in ndk r14 2017-05-02 21:34:13 +00:00
row_gcc.cc ARGBToAR30 use vpmulhuw to replicate fields 2017-12-12 02:57:54 +00:00
row_msa.cc Remove DISABLE_CLANG_MSA 2017-11-08 19:55:14 +00:00
row_neon64.cc Port HammingDistance to SSSE3 2017-10-03 19:11:05 +00:00
row_neon.cc fix clang-format-ing for row arm functions 2017-09-14 21:35:06 +00:00
row_win.cc scale float samples and return max value 2017-08-04 23:34:30 +00:00
scale_any.cc Add MSA optimized ScaleFilterCols, ScaleARGBCols, ScaleARGBFilterCols and ScaleRowDown34 functions 2017-08-18 17:23:27 +00:00
scale_argb.cc Add MSA optimized ScaleFilterCols, ScaleARGBCols, ScaleARGBFilterCols and ScaleRowDown34 functions 2017-08-18 17:23:27 +00:00
scale_common.cc ScaleRowUp2_16_C port of NEON to C 2017-09-05 21:40:39 +00:00
scale_dspr2.cc Rename mips source files to dspr2. 2017-01-27 23:11:43 +00:00
scale_gcc.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
scale_msa.cc Remove DISABLE_CLANG_MSA 2017-11-08 19:55:14 +00:00
scale_neon64.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
scale_neon.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
scale_win.cc Mark a bunch of kArray variables as const. 2017-11-27 23:38:44 +00:00
scale.cc Fix for ScaleDownBy4_Linear_16 2017-11-15 23:05:22 +00:00
video_common.cc clang-format libyuv 2016-11-07 17:37:23 -08:00