Android Audio/Video Task 4

Task 4. Learn the MediaExtractor and MediaMuxer APIs on the Android platform, and understand how to parse (demux) and package (mux) mp4 files

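MediaCodec and MediaMuxer are relatively young APIs: MediaCodec was introduced in JB 4.1 and MediaMuxer in JB 4.3. MediaMuxer mixes audio and video into a single multimedia file. Its drawback is that it supports only one audio track and one video track (a limit that applies to platform versions before Android 8.0). The supported output formats are mp4, 3gp and webm.
MediaCodec compresses and encodes audio and video. One impressive feature is that it can encode the contents of a Surface; the screen recording feature in KK 4.4, for example, is built on it.
MediaFormat describes the format of media data.
MediaRecorder records and encodes in one step, producing an encoded file such as mp4 or 3gpp. For video it is mainly used to record the Camera preview.
MediaPlayer plays encoded audio/video files.
AudioRecord records PCM data.
AudioTrack plays PCM data. PCM is raw audio sample data, and it can be played with a player such as vlc.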

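MediaExtractor extracts demuxed, encoded media data from a data source. It is used to split audio and video apart, for example to pull the audio track and the video track out of a video file separately.
It can extract the video track or the audio track from a video file, and it can also extract the audio track from an audio-only file.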

// Usage:

MediaExtractor extractor = new MediaExtractor();
 extractor.setDataSource(…);
 int numTracks = extractor.getTrackCount(); // number of tracks in the source
 for (int i = 0; i < numTracks; ++i) {
   MediaFormat format = extractor.getTrackFormat(i);
   String mime = format.getString(MediaFormat.KEY_MIME);
   // Video track MIME types start with "video/", audio track MIME types with "audio/".
   boolean isVideoTrack = mime.startsWith("video/");
   if (isVideoTrack) {
     extractor.selectTrack(i);   // select the video track; a track of interest must be selected before reading
     // To mux this track, the muxer needs its format as well as its data;
     // the index returned by addTrack is the one to use when writing samples.
     int index = mediaMuxer.addTrack(format);
   }
 }
 // The capacity must be large enough for the biggest sample, otherwise readSampleData
 // crashes; 1000*1024 worked in my tests.
 ByteBuffer inputBuffer = ByteBuffer.allocate(…);
 // readSampleData updates the buffer's limit and position (as of API 21), much like read/write.
 while (extractor.readSampleData(inputBuffer, ...) >= 0) {
   int trackIndex = extractor.getSampleTrackIndex();
   long presentationTimeUs = extractor.getSampleTime();
   ...
   extractor.advance(); // move to the next sample (next video or audio frame); readSampleData does not advance on its own
 }

 extractor.release();
 extractor = null;
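
The MIME-prefix test used in the loop above can be pulled out into a small helper that also recognises audio tracks. This is only a sketch in plain Java: `TrackKinds`, `Kind` and `classify` are hypothetical names introduced here, not Android SDK APIs, and the lower-casing is a defensive assumption rather than something the notes require.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the Android SDK): classifies a track by its MIME type. */
final class TrackKinds {
    enum Kind { VIDEO, AUDIO, OTHER }

    /** Returns VIDEO for "video/...", AUDIO for "audio/...", OTHER for anything else (including null). */
    static Kind classify(String mime) {
        if (mime == null) {
            return Kind.OTHER;
        }
        String m = mime.toLowerCase(Locale.ROOT);
        if (m.startsWith("video/")) {
            return Kind.VIDEO;
        }
        if (m.startsWith("audio/")) {
            return Kind.AUDIO;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        // MIME strings taken from the format dumps and the metadata example in these notes.
        System.out.println(classify("video/avc"));
        System.out.println(classify("audio/mp4a-latm"));
        System.out.println(classify("application/gyro"));
    }
}
```

In the extractor loop, the value of `format.getString(MediaFormat.KEY_MIME)` would be passed to `classify`, and tracks of the wanted kind would then be selected.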

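MediaMuxer muxes elementary streams. It currently supports MP4, WebM and 3GP files as output, and since Android Nougat it also supports muxing B-frames in MP4. It is the counterpart of MediaExtractor: it combines a video track and an audio track into one file, which is exactly the reverse of what MediaExtractor does.
It does not accept mp3 or wav audio sources (AAC is supported).
It supports only one audio track and one video track (on platform versions before Android 8.0).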

// Usage:
MediaMuxer muxer = new MediaMuxer("temp.mp4", OutputFormat.MUXER_OUTPUT_MPEG_4);
muxer.setOrientationHint(90); // rotation hint for the muxed video; must be set before start()
 // More often, the MediaFormat will be retrieved from MediaCodec.getOutputFormat()
 // or MediaExtractor.getTrackFormat().
 MediaFormat audioFormat = new MediaFormat(...);
 MediaFormat videoFormat = new MediaFormat(...);
 int audioTrackIndex = muxer.addTrack(audioFormat); // returns the track index inside the muxer, i.e. in the new file
 int videoTrackIndex = muxer.addTrack(videoFormat);
 ByteBuffer inputBuffer = ByteBuffer.allocate(bufferSize);
 boolean finished = false;
 BufferInfo bufferInfo = new BufferInfo();

 muxer.start(); // start muxing
 while(!finished) {
   // getInputBuffer() will fill the inputBuffer with one frame of encoded
   // sample from either MediaCodec or MediaExtractor, set isAudioSample to
   // true when the sample is audio data, set up all the fields of bufferInfo,
   // and return true if there are no more samples.
   finished = getInputBuffer(inputBuffer, isAudioSample, bufferInfo);
   if (!finished) {
     int currentTrackIndex = isAudioSample ? audioTrackIndex : videoTrackIndex;
     // When the samples come from a MediaExtractor, bufferInfo is filled like this:
     //   bufferInfo.presentationTimeUs = extractor.getSampleTime(); // taken directly from the extractor
     //   bufferInfo.offset = 0;  // usually 0 unless there is a special need
     //   bufferInfo.flags = extractor.getSampleFlags();
     //   bufferInfo.size = size;  // number of bytes in this sample
     // Every write needs both the track index and the info. The index is the track
     // in the newly muxed file, i.e. the value returned by addTrack.
     muxer.writeSampleData(currentTrackIndex, inputBuffer, bufferInfo);
   }
 }
 muxer.stop();
 muxer.release();

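Metadata Track
Per-frame metadata carries extra information associated with the video or audio to facilitate offline processing; for example, gyro signals from the sensor can help stabilize the video during offline processing. Metadata tracks are supported only in the MP4 container. When adding a new metadata track, the track's MIME format must start with the prefix "application/", e.g. "application/gyro". The format/layout of the metadata is defined by the application. Writing metadata is nearly the same as writing video/audio data, except that the data does not come from MediaCodec. The application only needs to pass the byte buffer containing the metadata, together with the associated timestamp, to the writeSampleData(int, ByteBuffer, MediaCodec.BufferInfo) API. The timestamp must be in the same time base as the video and audio. The generated MP4 file uses the TextMetaDataSampleEntry defined in section 12.3.3.2 of ISOBMFF to signal the metadata's MIME format. When a file with a metadata track is extracted with MediaExtractor, the MIME format of the metadata is extracted into the MediaFormat.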
// Example: pass the gyro data along into the generated MP4
MediaMuxer muxer = new MediaMuxer("temp.mp4", OutputFormat.MUXER_OUTPUT_MPEG_4);
   // SetUp Video/Audio Tracks.
   MediaFormat audioFormat = new MediaFormat(...);
   MediaFormat videoFormat = new MediaFormat(...);
   int audioTrackIndex = muxer.addTrack(audioFormat);
   int videoTrackIndex = muxer.addTrack(videoFormat);

   // Setup Metadata Track
   MediaFormat metadataFormat = new MediaFormat(...);
   metadataFormat.setString(KEY_MIME, "application/gyro");
   int metadataTrackIndex = muxer.addTrack(metadataFormat);

   muxer.start();
   while(..) {
       // Allocate bytebuffer and write gyro data(x,y,z) into it.
       ByteBuffer metaData = ByteBuffer.allocate(bufferSize);
       metaData.putFloat(x);
       metaData.putFloat(y);
       metaData.putFloat(z);
       BufferInfo metaInfo = new BufferInfo();
       // Associate this metadata with the video frame by setting
       // the same timestamp as the video frame.
       metaInfo.presentationTimeUs = currentVideoTrackTimeUs;
       metaInfo.offset = 0;
       metaInfo.flags = 0;
       metaInfo.size = bufferSize;
       muxer.writeSampleData(metadataTrackIndex, metaData, metaInfo);
   }
   muxer.stop();
   muxer.release();
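
In the gyro example, `metaInfo.size` is set to `bufferSize`, which describes the payload correctly only if the buffer holds exactly the bytes that were written (12 bytes for three floats). The sketch below packs one reading in plain Java and derives the size from the buffer itself. `GyroSample` and `pack` are hypothetical names, and the layout (three floats in ByteBuffer's default big-endian order) is an assumption, since the notes say the layout is defined by the application.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: packs one gyro reading (x, y, z) into a buffer sized to its payload. */
final class GyroSample {
    /** Three 4-byte floats, 12 bytes in total. */
    static final int BYTES = 3 * Float.BYTES;

    static ByteBuffer pack(float x, float y, float z) {
        ByteBuffer buf = ByteBuffer.allocate(BYTES);
        buf.putFloat(x).putFloat(y).putFloat(z);
        // After flip(), position is 0 and limit is 12, so the readable region is exactly the payload.
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer sample = pack(0.1f, -0.2f, 0.3f);
        // These two values are what would go into BufferInfo.offset and BufferInfo.size.
        System.out.println("offset=" + sample.position() + " size=" + sample.remaining());
    }
}
```

With a buffer built this way, the BufferInfo fields can be taken from `position()` and `remaining()` instead of a separately tracked `bufferSize`.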

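The MediaCodec class can be used to access low-level media codecs, i.e. encoder/decoder components. It is part of the Android low-level multimedia support infrastructure (normally used together with MediaExtractor, MediaSync, MediaMuxer, MediaCrypto, MediaDrm, Image, Surface, and AudioTrack).

In broad terms, a codec processes input data to generate output data. It processes data asynchronously and uses a set of input and output buffers. At a simplistic level, you request (or receive) an empty input buffer, fill it with data and send it to the codec for processing. The codec uses up the data and transforms it into one of its empty output buffers. Finally, you request (or receive) a filled output buffer, consume its contents and release it back to the codec.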

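Data Types
Codecs operate on three kinds of data: compressed data, raw audio data and raw video data. All three kinds can be processed using ByteBuffers, but you should use a Surface for raw video data to improve codec performance. A Surface uses native video buffers without mapping or copying them to ByteBuffers, so it is much more efficient. You normally cannot access the raw video data when using a Surface, but you can use the ImageReader class to access unsecured decoded (raw) video frames. This may still be more efficient than using ByteBuffers, as some native buffers may be mapped into direct ByteBuffers. When using ByteBuffer mode, you can access raw video frames using the Image class and getInput/OutputImage(int).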

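Raw Audio Buffers
Raw audio buffers contain entire frames of PCM audio data, where one frame is one sample for each channel in channel order. Each sample is a 16-bit signed integer in native byte order.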

short[] getSamplesForChannel(MediaCodec codec, int bufferId, int channelIx) {
  ByteBuffer outputBuffer = codec.getOutputBuffer(bufferId);
  MediaFormat format = codec.getOutputFormat(bufferId);
  ShortBuffer samples = outputBuffer.order(ByteOrder.nativeOrder()).asShortBuffer();
  int numChannels = format.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
  if (channelIx < 0 || channelIx >= numChannels) {
    return null;
  }
  // Samples are interleaved, so channel channelIx sits at every numChannels-th position.
  short[] res = new short[samples.remaining() / numChannels];
  for (int i = 0; i < res.length; ++i) {
    res[i] = samples.get(i * numChannels + channelIx);
  }
  return res;
}
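
To try the de-interleaving logic without a device, the same indexing can be exercised on a hand-built buffer. This is a sketch in plain Java under stated assumptions: `PcmChannels`, `samplesForChannel` and `stereoFixture` are hypothetical names, and the channel count is passed in directly instead of being read from a MediaFormat.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;
import java.util.Arrays;

/** Hypothetical stand-alone version of the channel extraction, with no MediaCodec involved. */
final class PcmChannels {
    /** Returns the samples of one channel from interleaved 16-bit PCM, or null for a bad channel index. */
    static short[] samplesForChannel(ByteBuffer pcm, int numChannels, int channelIx) {
        if (numChannels <= 0 || channelIx < 0 || channelIx >= numChannels) {
            return null;
        }
        // duplicate() leaves the caller's position and limit untouched.
        ShortBuffer samples = pcm.duplicate().order(ByteOrder.nativeOrder()).asShortBuffer();
        short[] res = new short[samples.remaining() / numChannels];
        for (int i = 0; i < res.length; ++i) {
            res[i] = samples.get(i * numChannels + channelIx);
        }
        return res;
    }

    /** Builds three stereo frames: left = 10, 20, 30 and right = -10, -20, -30. */
    static ByteBuffer stereoFixture() {
        short[] interleaved = {10, -10, 20, -20, 30, -30};
        ByteBuffer pcm = ByteBuffer.allocate(interleaved.length * Short.BYTES)
                .order(ByteOrder.nativeOrder());
        for (short s : interleaved) {
            pcm.putShort(s);
        }
        pcm.flip();
        return pcm;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(samplesForChannel(stereoFixture(), 2, 0)));
        System.out.println(Arrays.toString(samplesForChannel(stereoFixture(), 2, 1)));
    }
}
```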

// format info of the extracted audio track
0 = {HashMap$HashMapEntry@4594} "mime" -> "audio/mp4a-latm"
1 = {HashMap$HashMapEntry@4595} "aac-profile" -> "2"
2 = {HashMap$HashMapEntry@4596} "channel-count" -> "2"
3 = {HashMap$HashMapEntry@4597} "track-id" -> "1"
4 = {HashMap$HashMapEntry@4598} "durationUs" -> "192911760"
5 = {HashMap$HashMapEntry@4599} "csd-0" -> "java.nio.HeapByteBuffer[pos=0 lim=2 cap=2]"
6 = {HashMap$HashMapEntry@4600} "sample-rate" -> "44100"
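
The 2-byte "csd-0" entry of an AAC track normally holds an AudioSpecificConfig, whose short form packs the object type (5 bits), the sampling-frequency index (4 bits) and the channel configuration (4 bits). The sketch below decodes that short form in plain Java. `AacConfig` is a hypothetical helper, and the bytes 0x12 0x10 are assumed example values that happen to match the dump (object type 2, 44100 Hz, 2 channels); the dump itself does not show the real buffer contents.

```java
/** Hypothetical helper that decodes the short (2-byte) form of an AAC AudioSpecificConfig. */
final class AacConfig {
    // Sampling rates indexed by the 4-bit samplingFrequencyIndex (indices 0..12).
    private static final int[] SAMPLE_RATES = {
        96000, 88200, 64000, 48000, 44100, 32000,
        24000, 22050, 16000, 12000, 11025, 8000, 7350
    };

    final int objectType;
    final int sampleRate;
    final int channelConfig;

    private AacConfig(int objectType, int sampleRate, int channelConfig) {
        this.objectType = objectType;
        this.sampleRate = sampleRate;
        this.channelConfig = channelConfig;
    }

    static AacConfig parse(byte[] csd0) {
        if (csd0 == null || csd0.length < 2) {
            throw new IllegalArgumentException("need at least 2 bytes");
        }
        int bits = ((csd0[0] & 0xFF) << 8) | (csd0[1] & 0xFF);
        int objectType = (bits >> 11) & 0x1F;   // top 5 bits
        int freqIndex = (bits >> 7) & 0x0F;     // next 4 bits
        int channelConfig = (bits >> 3) & 0x0F; // next 4 bits
        // Escape values (object type 31, explicit or reserved frequency index) are not handled here.
        if (objectType == 31 || freqIndex >= SAMPLE_RATES.length) {
            throw new IllegalArgumentException("extended form not handled by this sketch");
        }
        return new AacConfig(objectType, SAMPLE_RATES[freqIndex], channelConfig);
    }

    public static void main(String[] args) {
        // Assumed example bytes; the dump does not show the real csd-0 contents.
        AacConfig cfg = parse(new byte[] {0x12, 0x10});
        System.out.println(cfg.objectType + " " + cfg.sampleRate + " " + cfg.channelConfig);
    }
}
```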

// format info of the extracted video track
0 = {HashMap$HashMapEntry@4620} "csd-1" -> "java.nio.HeapByteBuffer[pos=0 lim=8 cap=8]"
1 = {HashMap$HashMapEntry@4621} "rotation-degrees" -> "90"
2 = {HashMap$HashMapEntry@4622} "track-id" -> "1"
3 = {HashMap$HashMapEntry@4623} "height" -> "1200"
4 = {HashMap$HashMapEntry@4624} "profile" -> "1"
5 = {HashMap$HashMapEntry@4625} "color-standard" -> "1"
6 = {HashMap$HashMapEntry@4626} "durationUs" -> "2157877"
7 = {HashMap$HashMapEntry@4627} "color-transfer" -> "3"
8 = {HashMap$HashMapEntry@4628} "mime" -> "video/avc"
9 = {HashMap$HashMapEntry@4629} "frame-rate" -> "30"
10 = {HashMap$HashMapEntry@4630} "width" -> "1600"
11 = {HashMap$HashMapEntry@4631} "color-range" -> "2"
12 = {HashMap$HashMapEntry@4632} "max-input-size" -> "123106"
13 = {HashMap$HashMapEntry@4633} "csd-0" -> "java.nio.HeapByteBuffer[pos=0 lim=21 cap=21]"
14 = {HashMap$HashMapEntry@4634} "level" -> "2048"

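Note that the track indices in the source files must be kept apart from the track indices in the newly muxed file. The former are used to make the extractor select the right track; the latter are needed when writing sample data into the muxed file. I made exactly this mistake: when extracting the video track, its index in the video file was 0, and when extracting the audio frames, the audio track's index in the audio file was also 0. When the tracks were added to the muxer, the video track became track 0 of the new file and the audio track became track 1. The audio data was still written to track 0, however, so the app crashed with the error: stop muxer failed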
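
One way to avoid this mix-up is to record, for every source track, the index that the muxer's addTrack returned, and to look it up at write time instead of reusing the extractor's index. The sketch below shows only that bookkeeping in plain Java. `TrackIndexMap` and the file names are hypothetical, and the muxer indices are passed in by hand to stand in for addTrack's return values.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical bookkeeping helper: maps (source, source track index) to the muxer's track index. */
final class TrackIndexMap {
    private final Map<String, Integer> muxerIndexBySource = new HashMap<>();

    private static String key(String sourceName, int sourceTrackIndex) {
        return sourceName + "#" + sourceTrackIndex;
    }

    /** Records the index that addTrack returned for the given source track. */
    void register(String sourceName, int sourceTrackIndex, int muxerTrackIndex) {
        muxerIndexBySource.put(key(sourceName, sourceTrackIndex), muxerTrackIndex);
    }

    /** Returns the index to pass to writeSampleData; fails loudly if the track was never added. */
    int muxerIndexFor(String sourceName, int sourceTrackIndex) {
        Integer idx = muxerIndexBySource.get(key(sourceName, sourceTrackIndex));
        if (idx == null) {
            throw new IllegalStateException(
                    "track was never added to the muxer: " + key(sourceName, sourceTrackIndex));
        }
        return idx;
    }

    public static void main(String[] args) {
        TrackIndexMap map = new TrackIndexMap();
        // Both source tracks have index 0 in their own files, as in the mistake described above.
        map.register("video.mp4", 0, 0); // stands in for muxer.addTrack(videoFormat)
        map.register("audio.m4a", 0, 1); // stands in for muxer.addTrack(audioFormat)
        System.out.println(map.muxerIndexFor("audio.m4a", 0));
    }
}
```

Keying on the source as well as the track index matters here because the video and the audio come from two different files that both number their only track 0.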

The full project code is at https://github.com/lujianyun06/VATask/tree/master/app/src/main/java/com/example/lll/va/task4