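Why Retrofit requests are automatically URL-encoded

Background
Observed behavior
How it works
- Starting from Retrofit
  - Retrofit request sample code
  - The GET request
- On to OkHttp
Summary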

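Background

The iOS client hit network request failures in production. The root cause was that a customer's userId contained the character "|". That parameter is sent with the network request, and the iOS side did not URL-encode the character, so the request failed.

I then tested and investigated the Android client and found that it did not have this problem: the networking stack, Retrofit + OkHttp, encodes the parameters automatically.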

Observed behavior

The userId parameter passed in: "哈哈|ABC"
The userId parameter after encoding: %E5%93%88%E5%93%88%7CABC

https://a.b.c/ef/v1/werqwe/csdfwef/xcvweg?userId=%E5%93%88%E5%93%88%7CABC&origin=android-SDK

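How it works

First, URL encoding is also known as percent-encoding. For how the encoding itself works, see the article "percent-encode 百分号编码".
The URL above makes it clear that the parameter we passed in was URL-encoded somewhere inside the framework.
The rest of this post tracks down where in the networking stack that encoding happens.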
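To make the transformation concrete, here is a minimal sketch that reproduces the encoded userId using only the JDK. It uses java.net.URLEncoder, which implements form encoding and is not the code path OkHttp takes, so treat it as an illustration of percent-encoding in general; the class and method names are made up for this example. The input is written with Unicode escapes, where \u54c8\u54c8 is 哈哈.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PercentEncodeDemo {
    /** Percent-encodes a value as UTF-8 using the JDK's form encoder. */
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Each non-ASCII character becomes three UTF-8 bytes, each written as %XX;
        // '|' (0x7C) becomes %7C; the letters A, B, C are left untouched.
        System.out.println(encode("\u54c8\u54c8|ABC")); // %E5%93%88%E5%93%88%7CABC
    }
}
```

For this particular input the result matches the query string in the captured URL. The two encoders differ on other characters, so the match should not be assumed in general.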

Starting from Retrofit

Retrofit request sample code

 @GET("users/{user}/repos")
 suspend fun listReposKt(
     @Path("user") user: String,
     @Query("uid") uid: String
 ): List<GithubUserReposVO>

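We will only look at a GET request. It uses three annotations: @GET, @Path and @Query.
@GET marks the method as a GET request.
@Path fills in a segment of the request URL path.
@Query sets a query parameter of the request.

Our test showed that the parameter was URL-encoded, so let's look at the @Query annotation: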

@Documented
@Target(PARAMETER)
@Retention(RUNTIME)
public @interface Query {
  /** The query parameter name. */
  String value();
  /**
   * Specifies whether the parameter {@linkplain #value() name} and value are already URL encoded.
   */
  boolean encoded() default false;
}

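The encoded attribute defaults to false, meaning the name and value have not been URL-encoded in advance, so the framework will URL-encode them later.
If it is set to true, the developer is declaring that the value has already been URL-encoded, and the framework will not URL-encode the parameter again.

In fact, every annotation that marks a request parameter has an encoded() attribute, including @Query, @Field, @QueryMap and @FieldMap.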
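The reason the flag exists is easiest to see with a value that is already encoded: encoding it a second time turns every '%' into '%25' and corrupts the parameter. The sketch below mimics the meaning of the flag with a hypothetical helper, queryPair, and uses the JDK's URLEncoder as a stand-in for the framework's encoder. It is not Retrofit's API, and it simplifies the pre-encoded case to a plain pass-through.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodedFlagDemo {
    /**
     * Hypothetical helper that mimics the meaning of @Query(encoded = ...):
     * when encoded is true the value is used as given, otherwise it is percent-encoded.
     */
    static String queryPair(String name, String value, boolean encoded) {
        String v = encoded ? value : URLEncoder.encode(value, StandardCharsets.UTF_8);
        return name + "=" + v;
    }

    public static void main(String[] args) {
        String preEncoded = "%7CABC"; // "|ABC" that the caller has already encoded
        System.out.println(queryPair("userId", preEncoded, true));  // userId=%7CABC
        System.out.println(queryPair("userId", preEncoded, false)); // userId=%257CABC
    }
}
```

So the default of false is the safe choice for raw values, and true is only appropriate when the caller really has encoded the value beforehand.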

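The GET request

My first guess was that Retrofit handles parameter annotations differently depending on the request type, i.e. that it only processes @Query once it has seen @GET. Following that guess, let's look at how the @GET annotation is handled. (The guess later turned out to be wrong.)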

//RequestFactory.java
private void parseMethodAnnotation(Annotation annotation) {
  if (annotation instanceof DELETE) {
    parseHttpMethodAndPath("DELETE", ((DELETE) annotation).value(), false);
  } else if (annotation instanceof GET) {
    parseHttpMethodAndPath("GET", ((GET) annotation).value(), false);
  } else if (annotation instanceof HEAD) {
    parseHttpMethodAndPath("HEAD", ((HEAD) annotation).value(), false);
  } else if (annotation instanceof PATCH) {
    parseHttpMethodAndPath("PATCH", ((PATCH) annotation).value(), true);
  } else if (annotation instanceof POST) {
    parseHttpMethodAndPath("POST", ((POST) annotation).value(), true);
  } else if (annotation instanceof PUT) {
    parseHttpMethodAndPath("PUT", ((PUT) annotation).value(), true);
  } else if (annotation instanceof OPTIONS) {
    parseHttpMethodAndPath("OPTIONS", ((OPTIONS) annotation).value(), false);
  } else if (annotation instanceof HTTP) {
    HTTP http = (HTTP) annotation;
    parseHttpMethodAndPath(http.method(), http.path(), http.hasBody());
  } else if (annotation instanceof retrofit2.http.Headers) {
    String[] headersToParse = ((retrofit2.http.Headers) annotation).value();
    if (headersToParse.length == 0) {
      throw methodError(method, "@Headers annotation is empty.");
    }
    headers = parseHeaders(headersToParse);
  } else if (annotation instanceof Multipart) {
    if (isFormEncoded) {
      throw methodError(method, "Only one encoding annotation is allowed.");
    }
    isMultipart = true;
  } else if (annotation instanceof FormUrlEncoded) {
    if (isMultipart) {
      throw methodError(method, "Only one encoding annotation is allowed.");
    }
    isFormEncoded = true;
  }
}

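Look at the GET branch: it passes the annotation's value into parseHttpMethodAndPath. The value of @GET is normally a relative path. With Retrofit you supply a baseUrl when building the instance, and each request then appends its own relative path.
Next: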

//RequestFactory.java
private void parseHttpMethodAndPath(String httpMethod, String value, boolean hasBody) {
  if (this.httpMethod != null) {
    throw methodError(method, "Only one HTTP method is allowed. Found: %s and %s.",
        this.httpMethod, httpMethod);
  }
  this.httpMethod = httpMethod;
  this.hasBody = hasBody;
  if (value.isEmpty()) {
    return;
  }
  // Get the relative URL path and existing query string, if present.
  int question = value.indexOf('?');
  if (question != -1 && question < value.length() - 1) {
    // Ensure the query string does not have any named parameters.
    String queryParams = value.substring(question + 1);
    Matcher queryParamMatcher = PARAM_URL_REGEX.matcher(queryParams);
    if (queryParamMatcher.find()) {
      throw methodError(method, "URL query string \"%s\" must not have replace block. "
          + "For dynamic query parameters use @Query.", queryParams);
    }
  }
  this.relativeUrl = value;
  this.relativeUrlParamNames = parsePathParameters(value);
}

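This stores the relative path in the relativeUrl field. If the value contains a '?', the part after it is checked to make sure the hard-coded query string contains no {name} replacement blocks; dynamic query parameters must use @Query instead. Finally, the {name} placeholders found in the relative URL are parsed and stored in relativeUrlParamNames.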
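The placeholder handling described above can be sketched with a plain regular expression. The pattern below is an assumption chosen for illustration (a name wrapped in braces) and may not be identical to Retrofit's PARAM_URL_REGEX; the class and method names are hypothetical as well.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathParamDemo {
    // Assumed pattern for illustration: a name wrapped in braces, e.g. {user}.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([a-zA-Z][a-zA-Z0-9_-]*)\\}");

    /** Collects the placeholder names found in a relative URL, in order of appearance. */
    static Set<String> placeholderNames(String relativeUrl) {
        Matcher m = PLACEHOLDER.matcher(relativeUrl);
        Set<String> names = new LinkedHashSet<>();
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Returns true when the part after '?' contains a placeholder, the case that is rejected. */
    static boolean queryHasPlaceholder(String relativeUrl) {
        int question = relativeUrl.indexOf('?');
        if (question == -1 || question >= relativeUrl.length() - 1) return false;
        return PLACEHOLDER.matcher(relativeUrl.substring(question + 1)).find();
    }

    public static void main(String[] args) {
        System.out.println(placeholderNames("users/{user}/repos"));          // [user]
        System.out.println(queryHasPlaceholder("users/{user}/repos?a={b}")); // true
    }
}
```

Note that nothing here touches query values at all, which already hints that the encoding must happen elsewhere.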

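There is no trace of @Query here, so the guess above (that Retrofit handles parameter annotations per request type and only processes @Query after seeing @GET) is wrong.

When you read source code with a specific question in mind, it is hard to analyze the whole architecture. You end up guessing and jumping through the code straight to the part you care about. That does not give you a global, architectural understanding of the design, but it is a fast way to find the mechanism behind a problem, it saves time, and the specific issue sticks in memory better.

So let's jump to where the @Query annotation is used instead: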

//RequestFactory.java
@Nullable
private ParameterHandler<?> parseParameterAnnotation(
    int p, Type type, Annotation[] annotations, Annotation annotation) {
    if (annotation instanceof Url) {
        //...
    }else if (annotation instanceof Path){
        //...
    }else if (annotation instanceof Query){
        validateResolvableType(p, type);
        Query query = (Query) annotation;
        String name = query.value();
        boolean encoded = query.encoded();
        Class<?> rawParameterType = Utils.getRawType(type);
        gotQuery = true;
        if (Iterable.class.isAssignableFrom(rawParameterType)) {
          if (!(type instanceof ParameterizedType)) {
            throw parameterError(method, p, rawParameterType.getSimpleName()
                + " must include generic type (e.g., "
                + rawParameterType.getSimpleName()
                + "<String>)");
          }
          ParameterizedType parameterizedType = (ParameterizedType) type;
          Type iterableType = Utils.getParameterUpperBound(0, parameterizedType);
          Converter<?, String> converter =
              retrofit.stringConverter(iterableType, annotations);
          return new ParameterHandler.Query<>(name, converter, encoded).iterable();
        } else if (rawParameterType.isArray()) {
          Class<?> arrayComponentType = boxIfPrimitive(rawParameterType.getComponentType());
          Converter<?, String> converter =
              retrofit.stringConverter(arrayComponentType, annotations);
          return new ParameterHandler.Query<>(name, converter, encoded).array();
        } else {
          Converter<?, String> converter =
              retrofit.stringConverter(type, annotations);
          return new ParameterHandler.Query<>(name, converter, encoded);
        }
    }else if (annotation instanceof QueryName){
        //...
    }else if(...){
        //...
    }
    //...
}

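The @Query handling lives in parseParameterAnnotation. It is a long function, more than 400 lines, so we only look at the Query branch.
The branch reads the flag with query.encoded() and passes it to the constructor of ParameterHandler.Query.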

// ParameterHandler.java
static final class Query<T> extends ParameterHandler<T> {
    private final String name;
    private final Converter<T, String> valueConverter;
    private final boolean encoded;
    Query(String name, Converter<T, String> valueConverter, boolean encoded) {
        this.name = checkNotNull(name, "name == null");
        this.valueConverter = valueConverter;
        this.encoded = encoded;
    }
    @Override void apply(RequestBuilder builder, @Nullable T value) throws IOException {
        if (value == null) return; // Skip null values.
        String queryValue = valueConverter.convert(value);
        if (queryValue == null) return; // Skip converted but null values
        builder.addQueryParam(name, queryValue, encoded);
    }
}

The encoded flag is used in apply, where it is passed on to builder.addQueryParam. Here builder is a RequestBuilder, which is what builds the OkHttp Request object.
Here is its addQueryParam function:

// RequestBuilder.java
void addQueryParam(String name, @Nullable String value, boolean encoded) {
    if (relativeUrl != null) {
        // Do a one-time combination of the built relative URL and the base URL.
        urlBuilder = baseUrl.newBuilder(relativeUrl);
        if (urlBuilder == null) {
            throw new IllegalArgumentException(
                "Malformed URL. Base: " + baseUrl + ", Relative: " + relativeUrl);
        }
        relativeUrl = null;
    }
    if (encoded) {
        //noinspection ConstantConditions Checked to be non-null by above 'if' block.
        urlBuilder.addEncodedQueryParameter(name, value);
    } else {
        //noinspection ConstantConditions Checked to be non-null by above 'if' block.
        urlBuilder.addQueryParameter(name, value);
    }
}

Depending on the encoded flag, it calls either addEncodedQueryParameter or addQueryParameter on urlBuilder.
urlBuilder is an HttpUrl.Builder, the class used to construct URLs.
HttpUrl comes from OkHttp, so from here on we need to read OkHttp.

On to OkHttp

Here are the definitions of the two methods mentioned above:

//HttpUrl.Builder
/** Encodes the query parameter using UTF-8 and adds it to this URL's query string. */
fun addQueryParameter(name: String, value: String?) = apply {
    if (encodedQueryNamesAndValues == null) encodedQueryNamesAndValues = mutableListOf()
    encodedQueryNamesAndValues!!.add(name.canonicalize(
        encodeSet = QUERY_COMPONENT_ENCODE_SET,
        plusIsSpace = true
    ))
    encodedQueryNamesAndValues!!.add(value?.canonicalize(
        encodeSet = QUERY_COMPONENT_ENCODE_SET,
        plusIsSpace = true
    ))
}
/** Adds the pre-encoded query parameter to this URL's query string. */
fun addEncodedQueryParameter(encodedName: String, encodedValue: String?) = apply {
    if (encodedQueryNamesAndValues == null) encodedQueryNamesAndValues = mutableListOf()
    encodedQueryNamesAndValues!!.add(encodedName.canonicalize(
        encodeSet = QUERY_COMPONENT_REENCODE_SET,
        alreadyEncoded = true,
        plusIsSpace = true
    ))
    encodedQueryNamesAndValues!!.add(encodedValue?.canonicalize(
        encodeSet = QUERY_COMPONENT_REENCODE_SET,
        alreadyEncoded = true,
        plusIsSpace = true
    ))
}

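The logic is simple: both methods add the name, processed by canonicalize, to the encodedQueryNamesAndValues list, and then add the value, also processed by canonicalize.
"Canonical" means normalized, so the parameter name and value are being normalized here. Could that be the URL encoding? Let's keep reading.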

/**
 * Returns a substring of `input` on the range `[pos..limit)` with the following
 * transformations:
 *
 *  * Tabs, newlines, form feeds and carriage returns are skipped.
 *
 *  * In queries, ' ' is encoded to '+' and '+' is encoded to "%2B".
 *
 *  * Characters in `encodeSet` are percent-encoded.
 *
 *  * Control characters and non-ASCII characters are percent-encoded.
 *
 *  * All other characters are copied without transformation.
 *
 * @param alreadyEncoded true to leave '%' as-is; false to convert it to '%25'.
 * @param strict true to encode '%' if it is not the prefix of a valid percent encoding.
 * @param plusIsSpace true to encode '+' as "%2B" if it is not already encoded.
 * @param unicodeAllowed true to leave non-ASCII codepoint unencoded.
 * @param charset which charset to use, null equals UTF-8.
 */
internal fun String.canonicalize(
  pos: Int = 0,
  limit: Int = length,
  encodeSet: String,
  alreadyEncoded: Boolean = false,
  strict: Boolean = false,
  plusIsSpace: Boolean = false,
  unicodeAllowed: Boolean = false,
  charset: Charset? = null
): String {
  var codePoint: Int
  var i = pos
  while (i < limit) {
    codePoint = codePointAt(i)
    if (codePoint < 0x20 ||
        codePoint == 0x7f ||
        codePoint >= 0x80 && !unicodeAllowed ||
        codePoint.toChar() in encodeSet ||
        codePoint == '%'.toInt() &&
        (!alreadyEncoded || strict && !isPercentEncoded(i, limit)) ||
        codePoint == '+'.toInt() && plusIsSpace) {
      // Slow path: the character at i requires encoding!
      val out = Buffer()
      out.writeUtf8(this, pos, i)
      out.writeCanonicalized(
          input = this,
          pos = i,
          limit = limit,
          encodeSet = encodeSet,
          alreadyEncoded = alreadyEncoded,
          strict = strict,
          plusIsSpace = plusIsSpace,
          unicodeAllowed = unicodeAllowed,
          charset = charset
      )
      return out.readUtf8()
    }
    i += Character.charCount(codePoint)
  }
  // Fast path: no characters in [pos..limit) required encoding.
  return substring(pos, limit)
}

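The guess was right this time: this function is what performs the URL encoding.
We will not go through the algorithm in detail. Note that the function takes an alreadyEncoded: Boolean parameter, which tells it whether the input has already been encoded.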
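For readers who want the gist without reading writeCanonicalized, here is a heavily simplified sketch of the rules listed in the doc comment. It is not OkHttp's implementation: it leaves out whitespace skipping, the plusIsSpace handling, the strict mode and the charset parameter, and the encode set used in main is a made-up example rather than the real QUERY_COMPONENT_ENCODE_SET.

```java
import java.nio.charset.StandardCharsets;

public class CanonicalizeSketch {
    /**
     * Percent-encodes control characters, non-ASCII characters, characters in encodeSet,
     * and '%' unless the input is declared as already encoded. Everything else is copied.
     */
    static String canonicalize(String input, String encodeSet, boolean alreadyEncoded) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            boolean mustEncode = codePoint < 0x20
                || codePoint == 0x7f
                || codePoint >= 0x80
                || encodeSet.indexOf(codePoint) != -1
                || (codePoint == '%' && !alreadyEncoded);
            if (mustEncode) {
                // Write each UTF-8 byte of the code point as %XX.
                byte[] bytes =
                    new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%').append(String.format("%02X", b & 0xff));
                }
            } else {
                out.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String demoEncodeSet = "|&=#"; // assumed set, for illustration only
        System.out.println(canonicalize("\u54c8\u54c8|ABC", demoEncodeSet, false)); // %E5%93%88%E5%93%88%7CABC
        System.out.println(canonicalize("%7CABC", demoEncodeSet, true));            // %7CABC
        System.out.println(canonicalize("%7CABC", demoEncodeSet, false));           // %257CABC
    }
}
```

The last two calls show what alreadyEncoded changes: with true the existing '%' is left alone, with false it is escaped to '%25'.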

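We saw that all URL-encoded parameters are stored in the encodedQueryNamesAndValues list. How is that list used?
HttpUrl.Builder exists to build an HttpUrl: it assembles all the query parameters into the final string and hands the result to HttpUrl.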

// HttpUrl.Builder
fun build(): HttpUrl {
    @Suppress("UNCHECKED_CAST") // percentDecode returns either List<String?> or List<String>.
    return HttpUrl(
        scheme = scheme ?: throw IllegalStateException("scheme == null"),
        username = encodedUsername.percentDecode(),
        password = encodedPassword.percentDecode(),
        host = host ?: throw IllegalStateException("host == null"),
        port = effectivePort(),
        pathSegments = encodedPathSegments.percentDecode() as List<String>,
        queryNamesAndValues = encodedQueryNamesAndValues?.percentDecode(plusIsSpace = true),
        fragment = encodedFragment?.percentDecode(),
        url = toString()
    )
}

Look at the last argument, url = toString():

override fun toString(): String {
  return buildString {
    if (scheme != null) {
      append(scheme)
      append("://")
    } else {
      append("//")
    }
    if (encodedUsername.isNotEmpty() || encodedPassword.isNotEmpty()) {
      append(encodedUsername)
      if (encodedPassword.isNotEmpty()) {
        append(':')
        append(encodedPassword)
      }
      append('@')
    }
    if (host != null) {
      if (':' in host!!) {
        // Host is an IPv6 address.
        append('[')
        append(host)
        append(']')
      } else {
        append(host)
      }
    }
    if (port != -1 || scheme != null) {
      val effectivePort = effectivePort()
      if (scheme == null || effectivePort != defaultPort(scheme!!)) {
        append(':')
        append(effectivePort)
      }
    }
    encodedPathSegments.toPathString(this)
    if (encodedQueryNamesAndValues != null) {
      append('?')
      encodedQueryNamesAndValues!!.toQueryString(this)
    }
    if (encodedFragment != null) {
      append('#')
      append(encodedFragment)
    }
  }
}

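Look at the block near the end that checks encodedQueryNamesAndValues: it appends "?" and then the parameters via toQueryString: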

// HttpUrl.companion object
/** Returns a string for this list of query names and values. */
internal fun List<String?>.toQueryString(out: StringBuilder) {
    for (i in 0 until size step 2) {
        val name = this[i]
        val value = this[i + 1]
        if (i > 0) out.append('&')
        out.append(name)
        if (value != null) {
            out.append('=')
            out.append(value)
        }
    }
}

The loop steps by 2, appending each name and its value in turn.
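The same joining logic, transcribed into Java as a minimal sketch so it can be run on its own. The class name and the sample list are made up for illustration; the list mirrors the flat name, value, name, value layout of encodedQueryNamesAndValues, including a null value.

```java
import java.util.Arrays;
import java.util.List;

public class QueryStringSketch {
    /** Joins a flat [name0, value0, name1, value1, ...] list into "name0=value0&name1=value1". */
    static String toQueryString(List<String> namesAndValues) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < namesAndValues.size(); i += 2) {
            String name = namesAndValues.get(i);
            String value = namesAndValues.get(i + 1);
            if (i > 0) out.append('&');
            out.append(name);
            if (value != null) { // a null value yields a bare name with no '='
                out.append('=').append(value);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> pairs = Arrays.asList(
            "userId", "%E5%93%88%E5%93%88%7CABC", "origin", "android-SDK", "flag", null);
        System.out.println(toQueryString(pairs));
        // userId=%E5%93%88%E5%93%88%7CABC&origin=android-SDK&flag
    }
}
```

Because the values in the list are already encoded, this step only concatenates; it performs no encoding of its own.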
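So who uses the HttpUrl that gets built?
It is used directly both inside and outside OkHttp.
Inside OkHttp, the lower-level connection pool and the interceptors above it use it. Outside, third-party libraries such as Retrofit use it, and application code can use it directly as well.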

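Summary

OkHttp is seriously impressive. To recap the chain: @Query defaults to encoded = false, so Retrofit's RequestBuilder calls HttpUrl.Builder.addQueryParameter, and OkHttp's canonicalize percent-encodes the name and value before they are joined into the final URL.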

