Merge WEBRTC_ARCH_ARM64_NEON and WEBRTC_ARCH_ARM_NEON into one
WEBRTC_HAS_NEON.
Replace WEBRTC_DETECT_ARM_NEON with WEBRTC_DETECT_NEON.
Replace WEBRTC_ARCH_ARM with WEBRTC_ARCH_ARM64 for arm64 CPUs.
BUG=4002
R=andrew@webrtc.org, jridges@masque.com, kjellander@webrtc.org
Change-Id: I870a4d0682b80633b671c9aab733153f6d95a980
Review URL: https://webrtc-codereview.appspot.com/49309004
Cr-Commit-Position: refs/heads/master@{#9228}
In C, the macro is defined as
#define WEBRTC_SPL_MUL_16_16(a, b) \
((int32_t) (((int16_t)(a)) * ((int16_t)(b))))
(For definitions on ARMv7 and MIPS, see common_audio/signal_processing/include/spl_inl_{armv7,mips}.h)
Also includes
- style changes
- replaced pointer operations with direct element access
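As a rough illustration of the two kinds of change described here (this is not code from the CL), the sketch below contrasts a macro-plus-pointer loop with plain multiplication and indexed access. The function names dot_before and dot_after are hypothetical; the macro is copied from the C definition quoted in this message, and int is assumed to be at least 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* C definition as quoted in the commit message. */
#define WEBRTC_SPL_MUL_16_16(a, b) \
  ((int32_t) (((int16_t)(a)) * ((int16_t)(b))))

/* Hypothetical "before" style: macro call plus pointer arithmetic. */
static int32_t dot_before(const int16_t* x, const int16_t* y, size_t n) {
  int32_t sum = 0;
  while (n-- > 0) {
    sum += WEBRTC_SPL_MUL_16_16(*x, *y);
    x++;
    y++;
  }
  return sum;
}

/* Hypothetical "after" style: plain multiplication and direct element
 * access. Both operands are int16_t and are promoted to int before the
 * multiplication, so no extra casts are needed (assuming 32-bit int). */
static int32_t dot_after(const int16_t* x, const int16_t* y, size_t n) {
  int32_t sum = 0;
  size_t i;
  for (i = 0; i < n; i++) {
    sum += x[i] * y[i];
  }
  return sum;
}
```

Both functions are expected to return the same value for any input whose running sum fits in int32_t.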
BUG=3348,3353
TESTED=locally on Linux and trybots
R=kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/48949004
Cr-Commit-Position: refs/heads/master@{#9075}
We currently hit asserts in AECM where the output of WebRtcSpl_NormW16() on armv7 is incorrect.
I've verified that it outputs -17 for negative values. Internally, that means that clz returns 0 after a two's complement operation on an int16_t.
There is a mismatch between the int16_t input and the otherwise 32-bit assumptions. Explicitly casting to int32_t makes the two's complement operation do the correct thing.
The CL also extends the unit tests by running through a larger set of values.
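A portable sketch of the idea, not the armv7 code itself: clz32 is a hypothetical stand-in for the ARM clz instruction, and norm_w16_mismatch shows just one assumed way a 16-bit/32-bit mismatch could produce -17 (the actual mechanism on armv7 may differ). norm_w16_sketch widens to int32_t before flipping the bits of a negative input.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ARM clz instruction. Returns 32 for 0. */
static int clz32(uint32_t x) {
  int n = 0;
  if (x == 0) {
    return 32;
  }
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    n++;
  }
  return n;
}

/* Assumed failure mode: the value is treated as a 16-bit pattern, but the
 * bit flip is applied to all 32 bits, so the upper 16 bits become 1 and
 * clz sees a set top bit. */
static int16_t norm_w16_mismatch(int16_t a) {
  uint32_t bits = (uint16_t)a; /* zero-extended */
  if (a == 0) {
    return 0;
  }
  if (a < 0) {
    bits ^= 0xFFFFFFFFu;
  }
  return (int16_t)(clz32(bits) - 17); /* -17 for every negative input */
}

/* Sketch of the fix: widen to int32_t first, so the sign is extended and
 * flipping the bits of a negative value clears the upper 17 bits. */
static int16_t norm_w16_sketch(int16_t a) {
  int32_t a32 = a;
  if (a32 == 0) {
    return 0;
  }
  if (a32 < 0) {
    a32 = ~a32;
  }
  return (int16_t)(clz32((uint32_t)a32) - 17);
}
```

With the widening in place, inputs such as 1, -1 and -32768 give 14, 15 and 0 respectively, while the mismatched variant returns -17 for any negative input.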
BUG=4486
TESTED=locally on Android Nexus 7 and trybots
R=aluebs@webrtc.org, kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/49549004
Cr-Commit-Position: refs/heads/master@{#8897}
The macro is defined as
#define WEBRTC_SPL_MUL_16_16_RSFT(a, b, c) \
(WEBRTC_SPL_MUL_16_16(a, b) >> (c))
where the latter macro is defined in C as
#define WEBRTC_SPL_MUL_16_16(a, b) \
((int32_t) (((int16_t)(a)) * ((int16_t)(b))))
(For definitions on ARMv7 and MIPS, see common_audio/signal_processing/include/spl_inl_{armv7,mips}.h)
The replacement consists of
- avoiding casts to int16_t if inputs already are int16_t
- adding explicit cast to <type> if result is assigned to <type> (other than int or int32_t)
- minor cleanups such as removal of unnecessary parentheses and style changes
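To make the replacement rules concrete, here is a small sketch (not taken from the CL). The Q15 gain example and the names scale_q15_before and scale_q15_after are assumptions chosen for illustration; the two macros are the C definitions quoted in this message. Right-shifting a negative value is implementation-defined in C, and both versions rely on the same arithmetic-shift behavior.

```c
#include <stdint.h>

/* C definitions as quoted in the commit message. */
#define WEBRTC_SPL_MUL_16_16(a, b) \
  ((int32_t) (((int16_t)(a)) * ((int16_t)(b))))
#define WEBRTC_SPL_MUL_16_16_RSFT(a, b, c) \
  (WEBRTC_SPL_MUL_16_16(a, b) >> (c))

/* Hypothetical "before": macro call, result narrowed to int16_t. */
static int16_t scale_q15_before(int16_t x, int16_t gain_q15) {
  return (int16_t)WEBRTC_SPL_MUL_16_16_RSFT(x, gain_q15, 15);
}

/* Hypothetical "after": the inputs already are int16_t, so the inner casts
 * are dropped; the explicit cast to the result type is kept because the
 * result is not int or int32_t. */
static int16_t scale_q15_after(int16_t x, int16_t gain_q15) {
  return (int16_t)((x * gain_q15) >> 15);
}
```

For example, scaling 1000 by a Q15 gain of 16384 (0.5) gives 500 in both versions.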
BUG=3348, 3353
TESTED=locally on Linux for both fixed and floating point and trybots
R=kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/49499004
Cr-Commit-Position: refs/heads/master@{#8853}
The modification only uses the unique part of the WebRtcSpl_MaxAbsValue
function. It passes the Spltest.MinMaxOperationTest conformance test on
both ARMv7 and ARM64, and single-function performance is similar to that
of the original assembly version on different platforms. Unless
otherwise specified, the code is compiled with GCC 4.6. The results are
given as the "X version / C version" ratio; lower is better.
| run 100k times, using C as  | cortex-a7 | cortex-a15 |
| the base on each CPU target | (1.2 GHz) | (1.7 GHz)  |
|-----------------------------+-----------+------------|
| Neon asm                    | 32%       | 15%        |
| Neon intrinsics (GCC 4.6)   | 36%       | 37%        |
| Neon intrinsics (GCC 4.8)   | 35%       | 18%        |
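For readers unfamiliar with the intrinsics approach, the following is a minimal sketch of a max-absolute-value routine, not the code from this CL. The name max_abs_value_w16_sketch is hypothetical, the NEON path is guarded by the compiler-defined __ARM_NEON / __ARM_NEON__ macros rather than any WebRTC macro, and saturating the result to 32767 is an assumption about the intended behavior for an input of -32768. On targets without NEON only the scalar loop is compiled.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

static int16_t max_abs_value_w16_sketch(const int16_t* v, size_t len) {
  uint32_t max_abs = 0;
  size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  uint16x8_t max8 = vdupq_n_u16(0);
  uint16x4_t max4;
  for (; i + 8 <= len; i += 8) {
    int16x8_t x = vld1q_s16(v + i);
    /* vabsq_s16(-32768) wraps to 0x8000, which reads as 32768 once the
     * lanes are reinterpreted as unsigned. */
    uint16x8_t ax = vreinterpretq_u16_s16(vabsq_s16(x));
    max8 = vmaxq_u16(max8, ax);
  }
  /* Horizontal reduction: 8 lanes -> 4 lanes -> lane 0. */
  max4 = vmax_u16(vget_low_u16(max8), vget_high_u16(max8));
  max4 = vpmax_u16(max4, max4);
  max4 = vpmax_u16(max4, max4);
  max_abs = vget_lane_u16(max4, 0);
#endif
  /* Scalar loop: handles the tail, or everything when NEON is absent. */
  for (; i < len; i++) {
    int32_t x = v[i];
    uint32_t a = (uint32_t)(x < 0 ? -x : x);
    if (a > max_abs) {
      max_abs = a;
    }
  }
  /* Assumed behavior: saturate so the result fits in int16_t. */
  return (int16_t)(max_abs > 32767 ? 32767 : max_abs);
}
```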
BUG=3580
R=andrew@webrtc.org, jridges@masque.com
Change-Id: Ia2f6822ec58774b401cc440b6751a97e540b5048
Review URL: https://webrtc-codereview.appspot.com/30109004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@7803 4adac7df-926f-26a2-2b94-8c16560cd09d
Implemented the 3-band splitting filter bank by:
1. Upsample by 4/3.
2. Split twice into 2 bands.
3. Discard the uppermost band, because it is empty anyway.
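The arithmetic behind these three steps can be sketched as follows. The 48 kHz input rate used in the usage note is an assumption (the message does not state a rate), the names BandEdges and three_band_layout are hypothetical, and the sketch only computes nominal band edges; it does not implement the filters.

```c
/* Nominal frequency range of one band, in Hz. */
typedef struct {
  int low_hz;
  int high_hz;
} BandEdges;

/* Fills in the nominal edges of the three kept bands and returns the
 * sample rate of each band. fs_hz is the input sample rate. */
static int three_band_layout(int fs_hz, BandEdges bands[3]) {
  /* Step 1: upsample by 4/3. */
  int fs_up = fs_hz * 4 / 3;
  /* Step 2: two 2-band splits give 4 bands, each covering a quarter of
   * the upsampled Nyquist range and running at a quarter of fs_up. */
  int band_width = (fs_up / 2) / 4;
  int band_rate = fs_up / 4;
  int b;
  /* Step 3: keep the lower three bands. The fourth would start at
   * 3 * band_width, which equals the original Nyquist frequency
   * fs_hz / 2, so it carries no signal. */
  for (b = 0; b < 3; b++) {
    bands[b].low_hz = b * band_width;
    bands[b].high_hz = (b + 1) * band_width;
  }
  return band_rate;
}
```

Under the assumed 48 kHz input, this gives three 16 kHz bands nominally covering 0 to 8 kHz, 8 to 16 kHz and 16 to 24 kHz.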
A unittest was also implemented:
1. Generate a signal from presence or absence of sine waves of different frequencies.
2. Split into 3 bands and check their presence or absence.
3. Recombine the bands.
4. Calculate delay (as the filter is an IIR, the delay depends on frequency).
5. Check that the cross correlation of input and output is high enough at that delay.
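Step 5 could look roughly like the sketch below. This is not the unittest from the CL: the name xcorr_sq_at_delay is hypothetical, and returning the squared normalized correlation (which avoids a square root but discards the sign) is a simplification chosen for this example.

```c
#include <stddef.h>

/* Squared normalized cross-correlation between in[i] and out[i + delay],
 * in the range [0, 1]. Returns 0 if either signal has no energy over the
 * overlapping region. */
static double xcorr_sq_at_delay(const float* in, const float* out,
                                size_t n, size_t delay) {
  double xy = 0.0;
  double xx = 0.0;
  double yy = 0.0;
  size_t i;
  if (delay >= n) {
    return 0.0;
  }
  for (i = 0; i + delay < n; i++) {
    xy += (double)in[i] * out[i + delay];
    xx += (double)in[i] * in[i];
    yy += (double)out[i + delay] * out[i + delay];
  }
  if (xx == 0.0 || yy == 0.0) {
    return 0.0;
  }
  return (xy * xy) / (xx * yy);
}
```

A test would then compare the value at the calculated delay against a threshold; the threshold itself is not given in this message.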
BUG=webrtc:3146
R=andrew@webrtc.org, bjornv@webrtc.org, kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/31029004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@7754 4adac7df-926f-26a2-2b94-8c16560cd09d
The implementation of WEBRTC_SPL_RSHIFT_W16 is simply >>. This CL removes the macro usage in audio_processing and signal_processing.
Affected components:
* aecm
* agc
* nsx
Indirectly affecting (through signal_processing changes)
* codecs/cng
* codecs/isac/fix
* codecs/isac/main
BUG=3348,3353
TESTED=locally on Linux and trybots
R=kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/28699005
git-svn-id: http://webrtc.googlecode.com/svn/trunk@7432 4adac7df-926f-26a2-2b94-8c16560cd09d
This changes some method signatures to better reflect how callers are actually
using them. This also has the tendency to make signatures more consistent about
e.g. using int (instead of int16_t) for lengths of things like vectors, and
using int16_t (instead of int) for e.g. counts of bits in a value.
This also removes a couple of functions that were only called in unittests.
BUG=3353,chromium:81439
TEST=none
R=andrew@webrtc.org, bjornv@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/23389004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@7060 4adac7df-926f-26a2-2b94-8c16560cd09d
Macros should in general be avoided. WEBRTC_SPL_UMUL_32_16_RSFT16 is only used in iSAC fixed point as part of multiplying with LSB and MSB. A better approach is to have one function for that complete operation in iSAC.
This CL removes the macro and replaces the operation locally.
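One reading of "multiplying with LSB and MSB" is a 32-bit by 16-bit multiply, shifted right by 16, carried out on the two 16-bit halves of the 32-bit operand. The sketch below follows that reading; it is an assumption, not the function from iSAC, and the name umul_32_16_rsft16_sketch is hypothetical. It assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Computes floor((a * b) / 2^16) using only 32-bit arithmetic by
 * splitting a into its upper (MSB) and lower (LSB) 16-bit halves:
 *   a * b = (a_msb * 2^16 + a_lsb) * b
 * so (a * b) >> 16 = a_msb * b + ((a_lsb * b) >> 16), exactly. */
static uint32_t umul_32_16_rsft16_sketch(uint32_t a, uint16_t b) {
  uint32_t a_msb = a >> 16;
  uint32_t a_lsb = a & 0xFFFFu;
  /* Both partial products are below 2^32, and so is their sum. */
  return a_msb * b + ((a_lsb * b) >> 16);
}
```

The result matches a 64-bit reference computation, ((uint64_t)a * b) >> 16, for all inputs.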
BUG=3148, 3353
TESTED=locally on Linux and trybots
R=tina.legrand@webrtc.org, turaj@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/16349004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@6907 4adac7df-926f-26a2-2b94-8c16560cd09d
The macro is only used in four places in iSAC fixed point, and it has been replaced in those places.
In addition, it is used in a unit test, where it throws a warning treated as an error (issue 3674).
The macro has both MIPS and armv7 optimizations. Removing them impacts only MIPS platforms without DSP ASE. This may cause a very small increase in complexity when using iSAC fix.
The armv7 optimizations are not used anywhere, since specific ones are used inline in iSAC fix.
BUG=3348,3353,3674
TESTED=locally and trybots
R=ljubomir.papuga@gmail.com, tina.legrand@webrtc.org, turaj@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/16299004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@6871 4adac7df-926f-26a2-2b94-8c16560cd09d
In r6240 gcc was rolled from 4.6 to 4.8, changing the behavior on arm. The output of ComplexFFT differs, causing both AECM and NS to perform worse. According to issues reported against gcc, memory shuffling/optimization can occur despite the use of volatile, which affects the output.
Splitting the three instructions in one call into two separate calls makes the compiler take proper actions, resulting in correct outputs.
BUG=3370,3395
TESTED=trybots
R=kwiberg@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/21549004
git-svn-id: http://webrtc.googlecode.com/svn/trunk@6261 4adac7df-926f-26a2-2b94-8c16560cd09d