How ServerSocket Calls Work Under the Hood


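Experiment: use Java to mimic ServerSocket by calling the operating system's underlying socket functions and implement network communication
Goal: understand what native methods are and the steps of a JNI call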

The socket function on Linux

man 2 socket
NAME
       socket - create an endpoint for communication
SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>
       int socket(int domain, int type, int protocol);
DESCRIPTION
       socket()  creates  an  endpoint  for communication and returns a file descriptor that refers to that endpoint.
       The file descriptor returned by a successful call will be the lowest-numbered file  descriptor  not  currently
       open for the process.
       The  domain argument specifies a communication domain; this selects the protocol family which will be used for
       communication.  These families are defined in <sys/socket.h>.  The currently understood formats include:
       Name                Purpose                          Man page
       AF_UNIX, AF_LOCAL   Local communication              unix(7)
       AF_INET             IPv4 Internet protocols          ip(7)
       AF_INET6            IPv6 Internet protocols          ipv6(7)
       ... // other protocol families omitted
       The socket has the indicated type, which specifies the communication semantics.  Currently defined types are:
       SOCK_STREAM     Provides sequenced, reliable, two-way, connection-based byte  streams.   An  out-of-band  data
                       transmission mechanism may be supported.
       SOCK_DGRAM      Supports datagrams (connectionless, unreliable messages of a fixed maximum length).
       ... // other socket types omitted
       The protocol specifies a particular protocol to be used with the socket.   Normally  only  a  single  protocol
       exists to support a particular socket type within a given protocol family, in which case protocol can be spec‐
       ified as 0.  However, it is possible that many protocols may exist, in which case a particular  protocol  must
       be  specified  in  this manner.  The protocol number to use is specific to the “communication domain” in which
       communication is to take place; see protocols(5).  See getprotoent(3) on how to map protocol name  strings  to
       protocol numbers.
RETURN VALUE
       On success, a file descriptor for the new socket is returned.  On error, -1 is  returned,  and  errno  is  set
       appropriately.
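
To make the return-value contract concrete, here is a minimal sketch of creating an IPv4 TCP socket and checking for failure. The helper name `open_tcp_socket` is hypothetical and not part of the original walkthrough; it only wraps the `socket()` call described in the man page above.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an IPv4 TCP socket.
 * Returns the new file descriptor, or -1 after printing the errno text. */
int open_tcp_socket(void) {
    /* protocol 0 selects the default protocol for AF_INET + SOCK_STREAM */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        perror("socket");
    }
    return fd;
}
```

The caller owns the returned descriptor and should release it with `close(fd)` when finished.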

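The bind function on Linux

As the signature in the man page shows, binding a socket requires three things: the socket's file descriptor, a pointer to a struct sockaddr, and the length of that address structure. So where is struct sockaddr defined?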

man 2 bind
NAME
       bind - bind a name to a socket
SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>
       int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
DESCRIPTION
       When  a  socket  is  created  with  socket(2),  it  exists in a name space (address family) but has no address
       assigned to it.  bind() assigns the address specified by addr to the socket referred to by the file descriptor
       sockfd.   addrlen  specifies  the size, in bytes, of the address structure pointed to by addr.  Traditionally,
       this operation is called “assigning a name to a socket”.
       It is normally necessary to assign a local address using bind() before a SOCK_STREAM socket may  receive  con‐
       nections (see accept(2)).
       The  rules  used  in  name binding vary between address families.  Consult the manual entries in Section 7 for
       detailed information.  For AF_INET, see ip(7); for AF_INET6,  see  ipv6(7);  for  AF_UNIX,  see  unix(7);  for
       AF_APPLETALK,  see  ddp(7);  for  AF_PACKET,  see  packet(7);  for AF_X25, see x25(7); and for AF_NETLINK, see
       netlink(7).
       The actual structure passed for the addr argument will depend on the address family.  The  sockaddr  structure
       is defined as something like:
           struct sockaddr {
               sa_family_t sa_family;
               char        sa_data[14];
           }
       The  only purpose of this structure is to cast the structure pointer passed in addr in order to avoid compiler
       warnings.  See EXAMPLE below.
RETURN VALUE
       On success, zero is returned.  On error, -1 is returned, and errno is set appropriately.
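
The cast that the man page mentions can be sketched as follows. This is an illustration only: the helper name `bind_loopback_any_port` is hypothetical, and it binds to port 0 (which asks the kernel to pick a free port) rather than the fixed port used later in this article, so that it can run without clashing with other services.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind fd to 127.0.0.1 on a kernel-chosen port.
 * Returns 0 and stores the chosen port (host byte order) in *port_out,
 * or returns -1 on error. */
int bind_loopback_any_port(int fd, uint16_t *port_out) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                      /* 0: let the kernel choose */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* bind() takes the generic struct sockaddr *, hence the cast. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        return -1;
    }
    /* Ask the kernel which port it actually assigned. */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return 0;
}
```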

The ip(7) man page on Linux

Note that ip is a protocol description page rather than a function. This is where struct sockaddr_in and struct in_addr are described.

man 7 ip
NAME
       ip - Linux IPv4 protocol implementation
SYNOPSIS
       #include <sys/socket.h>
       #include <netinet/in.h>
       #include <netinet/ip.h> /* superset of previous */
       tcp_socket = socket(AF_INET, SOCK_STREAM, 0);
       udp_socket = socket(AF_INET, SOCK_DGRAM, 0);
       raw_socket = socket(AF_INET, SOCK_RAW, protocol);
DESCRIPTION
       Linux  implements  the Internet Protocol, version 4, described in RFC 791 and RFC 1122.  ip contains a level 2
       multicasting implementation conforming to RFC 1112.  It also contains an IP router including a packet filter.
       The programming interface is BSD-sockets compatible.  For more information on sockets, see socket(7).
       An IP socket is created using socket(2):
           socket(AF_INET, socket_type, protocol);
       Valid socket types are SOCK_STREAM to open a tcp(7) socket, SOCK_DGRAM to open a udp(7) socket, or SOCK_RAW to
       open  a  raw(7) socket to access the IP protocol directly.  protocol is the IP protocol in the IP header to be
       received or sent.  The only valid values for protocol are 0  and  IPPROTO_TCP  for  TCP  sockets,  and  0  and
       IPPROTO_UDP  for  UDP  sockets.   For  SOCK_RAW  you  may specify a valid IANA IP protocol defined in RFC 1700
       assigned numbers.
       When a process wants to receive new incoming packets or connections, it should bind a socket to a local inter‐
       face  address using bind(2).  In this case, only one IP socket may be bound to any given local (address, port)
       pair.  When INADDR_ANY is specified in the bind call, the socket will be bound to all local interfaces.   When
       listen(2)  is  called  on  an unbound socket, the socket is automatically bound to a random free port with the
       local address set to INADDR_ANY.  When connect(2) is called on an unbound socket, the socket is  automatically
       bound to a random free port or to a usable shared port with the local address set to INADDR_ANY.
       A  TCP  local  socket  address  that  has  been  bound  is unavailable for some time after closing, unless the
       SO_REUSEADDR flag has been set.  Care should be taken when using this flag as it makes TCP less reliable.
   Address format
       An IP socket address is defined as a combination of an IP interface address and a  16-bit  port  number.   The
       basic IP protocol does not supply port numbers, they are implemented by higher level protocols like udp(7) and
       tcp(7).  On raw sockets sin_port is set to the IP protocol.
           struct sockaddr_in {
               sa_family_t    sin_family; /* address family: AF_INET */
               in_port_t      sin_port;   /* port in network byte order */
               struct in_addr sin_addr;   /* internet address */
           };
           /* Internet address. */
           struct in_addr {
               uint32_t       s_addr;     /* address in network byte order */
           };
       sin_family is always set to AF_INET.  This is required; in Linux 2.2 most networking functions  return  EINVAL
       when  this setting is missing.  sin_port contains the port in network byte order.  The port numbers below 1024
       are called privileged ports (or sometimes: reserved ports).  Only a privileged process (on  Linux:  a  process
       that  has  the  CAP_NET_BIND_SERVICE  capability  in  the  user namespace governing its network namespace) may
       bind(2) to these sockets.  Note that the raw IPv4 protocol as such has no concept of a port, they  are  imple‐
       mented only by higher protocols like tcp(7) and udp(7).
       sin_addr  is  the IP host address.  The s_addr member of struct in_addr contains the host interface address in
       network byte order.  in_addr should be assigned one of  the  INADDR_*  values  (e.g.,  INADDR_LOOPBACK)  using
       htonl(3)  or set using the inet_aton(3), inet_addr(3), inet_makeaddr(3) library functions or directly with the
       name resolver (see gethostbyname(3)).
       IPv4 addresses are divided into unicast, broadcast, and multicast addresses.  Unicast addresses specify a sin‐
       gle  interface  of a host, broadcast addresses specify all hosts on a network, and multicast addresses address
       all hosts in a multicast group.  Datagrams to broadcast addresses can  be  sent  or  received  only  when  the
       SO_BROADCAST  socket  flag  is set.  In the current implementation, connection-oriented sockets are allowed to
       use only unicast addresses.
       Note that the address and the port are always stored in network byte order.  In particular,  this  means  that
       you  need  to call htons(3) on the number that is assigned to a port.  All address/port manipulation functions
       in the standard library work in network byte order.
       There are several special addresses: INADDR_LOOPBACK (127.0.0.1) always refers to the local host via the loop‐
       back  device; INADDR_ANY (0.0.0.0) means any address for binding; INADDR_BROADCAST (255.255.255.255) means any
       host and has the same effect on bind as INADDR_ANY for historical reasons.
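
The byte-order rules above are easy to get wrong, so here is a small sketch of filling a `struct sockaddr_in` from host-order values and turning the address back into text. The helper names `fill_ipv4_addr` and `ipv4_to_text` are hypothetical; the library calls (`htons`, `htonl`, `inet_ntop`) are the ones named in the man pages.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *out from a host-order address and port. */
void fill_ipv4_addr(struct sockaddr_in *out, uint32_t host_addr, uint16_t host_port) {
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;               /* always AF_INET for IPv4 */
    out->sin_port = htons(host_port);        /* 16-bit: host -> network order */
    out->sin_addr.s_addr = htonl(host_addr); /* 32-bit: host -> network order */
}

/* Hypothetical helper: write the dotted-decimal form of addr into buf.
 * Returns buf on success or NULL on error, like inet_ntop(). */
const char *ipv4_to_text(const struct sockaddr_in *addr, char *buf, socklen_t len) {
    return inet_ntop(AF_INET, &addr->sin_addr, buf, len);
}
```

For example, `fill_ipv4_addr(&a, INADDR_LOOPBACK, 8080)` followed by `ipv4_to_text(&a, text, sizeof(text))` leaves `"127.0.0.1"` in `text`, and `ntohs(a.sin_port)` gives back 8080.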

The accept function on Linux

man 2 accept
NAME
       accept, accept4 - accept a connection on a socket
SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>
       int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
       #define _GNU_SOURCE             /* See feature_test_macros(7) */
       #include <sys/socket.h>
       int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags);
DESCRIPTION
       The  accept()  system  call  is  used  with  connection-based  socket types (SOCK_STREAM, SOCK_SEQPACKET).  It
       extracts the first connection request on the queue of pending connections for the  listening  socket,  sockfd,
       creates a new connected socket, and returns a new file descriptor referring to that socket.  The newly created
       socket is not in the listening state.  The original socket sockfd is unaffected by this call.
       The argument sockfd is a socket that has been created with socket(2), bound to a local address  with  bind(2),
       and is listening for connections after a listen(2).
       The  argument  addr is a pointer to a sockaddr structure.  This structure is filled in with the address of the
       peer socket, as known to the communications layer.  The exact format of the address returned  addr  is  deter‐
       mined  by  the  socket's  address  family (see socket(2) and the respective protocol man pages).  When addr is
       NULL, nothing is filled in; in this case, addrlen is not used, and should also be NULL.
       The addrlen argument is a value-result argument: the caller must initialize it to contain the size (in  bytes)
       of the structure pointed to by addr; on return it will contain the actual size of the peer address.
       The  returned  address  is  truncated if the buffer provided is too small; in this case, addrlen will return a
       value greater than was supplied to the call.
       If no pending connections are present on the queue, and the socket is  not  marked  as  nonblocking,  accept()
       blocks  the  caller until a connection is present.  If the socket is marked nonblocking and no pending connec‐
       tions are present on the queue, accept() fails with the error EAGAIN or EWOULDBLOCK.
       In order to be notified of incoming connections on a socket, you can use select(2), poll(2), or  epoll(7).   A
       readable  event  will  be delivered when a new connection is attempted and you may then call accept() to get a
       socket for that connection.  Alternatively, you can set the socket to deliver SIGIO when activity occurs on  a
       socket; see socket(7) for details.
       If  flags  is 0, then accept4() is the same as accept().  The following values can be bitwise ORed in flags to
       obtain different behavior:
       SOCK_NONBLOCK   Set the O_NONBLOCK file status flag on the new open file description.  Using this  flag  saves
                       extra calls to fcntl(2) to achieve the same result.
       SOCK_CLOEXEC    Set  the  close-on-exec  (FD_CLOEXEC) flag on the new file descriptor.  See the description of
                       the O_CLOEXEC flag in open(2) for reasons why this may be useful.
RETURN VALUE
       On success, these system calls return a nonnegative integer that is a file descriptor for the accepted socket.
       On error, -1 is returned, and errno is set appropriately.
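
The value-result behaviour of addrlen is sketched below. Both function names are hypothetical. `loopback_accept_demo` is only a self-check, written on the assumption that the loopback interface is available: it connects a client socket to a listener inside the same process and verifies that the peer port reported by `accept()` equals the client's own local port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accept one connection and report the peer's port
 * in host byte order. Returns the connected fd, or -1 on error. */
int accept_with_peer(int lfd, uint16_t *peer_port) {
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer); /* in: buffer size, out: real address size */
    int connfd = accept(lfd, (struct sockaddr *)&peer, &len);
    if (connfd == -1) {
        return -1;
    }
    if (len > sizeof(peer)) {     /* address was truncated: not an IPv4 peer */
        close(connfd);
        return -1;
    }
    *peer_port = ntohs(peer.sin_port);
    return connfd;
}

/* Self-check on the loopback interface: returns 0 when the port reported
 * by accept() matches the client's local port, -1 otherwise. */
int loopback_accept_demo(void) {
    struct sockaddr_in addr, local;
    socklen_t len = sizeof(addr);
    uint16_t peer_port = 0;
    int result = -1, connfd = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0); /* kernel-chosen port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (lfd == -1 || cfd == -1) goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1) goto done;
    if (listen(lfd, 1) == -1) goto done;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) == -1) goto done;
    /* The kernel completes the handshake for a listening socket, so this
     * blocking connect() returns before accept() is called. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1) goto done;

    connfd = accept_with_peer(lfd, &peer_port);
    if (connfd == -1) goto done;

    len = sizeof(local);
    if (getsockname(cfd, (struct sockaddr *)&local, &len) == -1) goto done;
    if (ntohs(local.sin_port) == peer_port) result = 0;

done:
    if (connfd != -1) close(connfd);
    if (cfd != -1) close(cfd);
    if (lfd != -1) close(lfd);
    return result;
}
```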

The read function on Linux

man 2 read
NAME
       read - read from a file descriptor
SYNOPSIS
       #include <unistd.h>
       ssize_t read(int fd, void *buf, size_t count);
DESCRIPTION
       read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.
       On  files that support seeking, the read operation commences at the file offset, and the file offset is incre‐
       mented by the number of bytes read.  If the file offset is at or past the end of file, no bytes are read,  and
       read() returns zero.
       If  count  is  zero, read() may detect the errors described below.  In the absence of any errors, or if read()
       does not check for errors, a read() with a count of 0 returns zero and has no other effects.
       According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES  for
       the upper limit on Linux.
RETURN VALUE
       On  success,  the  number  of  bytes  read  is returned (zero indicates end of file), and the file position is
       advanced by this number.  It is not an error if this number is smaller than the  number  of  bytes  requested;
       this  may happen for example because fewer bytes are actually available right now (maybe because we were close
       to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was  interrupted
       by a signal.  See also NOTES.
       On  error,  -1  is returned, and errno is set appropriately.  In this case, it is left unspecified whether the
       file position (if any) changes.
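
Because `read()` may return fewer bytes than requested, and may be interrupted by a signal, callers that need a fixed amount of data usually loop. The following is a sketch of such a loop; the name `read_full` is hypothetical and not something the original article defines.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until count bytes have arrived,
 * end of file is reached, or a real error occurs.
 * Returns the number of bytes read (less than count only at end of file),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count) {
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n == 0) {
            break;              /* end of file, or the peer closed the socket */
        }
        if (n == -1) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal: retry */
            }
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

On a socket, a return value of 0 from `read()` means the peer has closed its side of the connection, which is why the loop stops there instead of retrying.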

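Main steps for starting a socket in C

With the man pages covered, we can now call the Linux socket functions from C.

Calling the socket function from C

// AF_INET: IPv4
// SOCK_STREAM: reliable, connection-based byte stream (TCP)
// 0: use the default protocol for this socket type
// Return value lfd: a file descriptor for the new socket, or -1 on error
int lfd = socket(AF_INET, SOCK_STREAM, 0);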

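Calling the bind function from C

struct sockaddr_in my_addr; // IPv4 socket address structure
memset(&my_addr, 0, sizeof(my_addr)); // zero the structure first (needs <string.h>)
my_addr.sin_family = AF_INET; // IPv4
my_addr.sin_port = htons(8080); // htons: convert the 16-bit port from host to network byte order
my_addr.sin_addr.s_addr = htonl(INADDR_ANY); // htonl: convert the 32-bit address from host to network byte order; INADDR_ANY binds all local interfaces
bind(lfd, (struct sockaddr*)&my_addr, sizeof(my_addr));
listen(lfd, 128); // mark the socket as listening; 128 is the backlog (maximum length of the pending-connection queue), not a concurrency limit

Once bind and listen have been called, we can call accept and wait for a client to connect.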
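
The snippet above ignores return values for brevity. A hedged sketch of the same socket, bind, listen sequence with error checks is shown here. The helper name `make_listener` is hypothetical, and setting SO_REUSEADDR is an addition that the original code does not make; it is included because ip(7) notes that a bound TCP address is otherwise unavailable for some time after closing, which gets in the way when restarting a test server.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on all interfaces.
 * port is in host byte order; 0 lets the kernel choose a free port.
 * Returns the listening fd, or -1 on error. */
int make_listener(uint16_t port) {
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        perror("socket");
        return -1;
    }
    /* Assumption: quick restarts matter more than the caveat in ip(7). */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) {
        perror("setsockopt");
        close(fd);
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind");
        close(fd);
        return -1;
    }
    if (listen(fd, 128) == -1) { /* 128: backlog of pending connections */
        perror("listen");
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling `make_listener(8080)` would correspond to the fixed port used in this article.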

Calling the accept function from C

struct sockaddr_in client_addr;
char client_ip[INET_ADDRSTRLEN] = "";
socklen_t client_addr_len = sizeof(client_addr);
int connfd = accept(lfd, (struct sockaddr*)&client_addr, &client_addr_len);

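Once accept returns, we hold a file descriptor for the connected client socket; calling read on it retrieves the data the client sends.

Calling the read function from C

char recv_buf[512] = "";
while (1) {
    // ssize_t read(int fd, void *buf, size_t count);
    ssize_t length = read(connfd, recv_buf, sizeof(recv_buf) - 1); // read from the stream, leaving room for '\0'
    if (length <= 0) { // 0: the client closed the connection; -1: error
        break;
    }
    recv_buf[length] = '\0'; // read() does not null-terminate the data
    printf("recv data=%zd\n", length);
    printf("%s\n", recv_buf);
}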

The complete C file

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include "WeServerSocket.h" // generated by javah; it includes <jni.h>
JNIEXPORT void JNICALL Java_WeServerSocket_conn(JNIEnv *env, jobject cl, jint port) {
    // int socket(int domain, int type, int protocol);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1) {
        perror("socket");
        return;
    }
    // int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
    struct sockaddr_in my_addr; // IPv4 socket address structure
    memset(&my_addr, 0, sizeof(my_addr));
    my_addr.sin_family = AF_INET; // IPv4
    my_addr.sin_port = htons(port); // htons: 16-bit port, host to network byte order
    my_addr.sin_addr.s_addr = htonl(INADDR_ANY); // htonl: 32-bit address, host to network byte order
    if (bind(lfd, (struct sockaddr*)&my_addr, sizeof(my_addr)) == -1) {
        perror("bind");
        close(lfd);
        return;
    }
    // int listen(int sockfd, int backlog);
    if (listen(lfd, 128) == -1) { // 128: maximum length of the pending-connection queue
        perror("listen");
        close(lfd);
        return;
    }
    printf("-listen client @port=%d...\n", (int)port);
    // int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
    struct sockaddr_in client_addr;
    char client_ip[INET_ADDRSTRLEN] = ""; // buffer for the client IP text, INET_ADDRSTRLEN = 16
    socklen_t client_addr_len = sizeof(client_addr);
    int connfd = accept(lfd, (struct sockaddr*)&client_addr, &client_addr_len); // file descriptor of the connected client socket
    if (connfd == -1) {
        perror("accept");
        close(lfd);
        return;
    }
    // const char *inet_ntop(int af, const void *src, char *dst, socklen_t size);
    inet_ntop(AF_INET, &client_addr.sin_addr, client_ip, INET_ADDRSTRLEN); // convert the client address (network byte order) to a dotted-decimal string in client_ip
    printf("------------------------------------------\n");
    printf("client ip=%s, port=%d\n", client_ip, ntohs(client_addr.sin_port)); // ntohs: 16-bit value, network to host byte order
    char recv_buf[512] = "";
    while (1) {
        // ssize_t read(int fd, void *buf, size_t count);
        ssize_t length = read(connfd, recv_buf, sizeof(recv_buf) - 1); // read from the stream, leaving room for '\0'
        if (length <= 0) { // 0: the client closed the connection; -1: error
            break;
        }
        recv_buf[length] = '\0';
        printf("recv data=%zd\n", length);
        printf("%s\n", recv_buf);
    }
    close(connfd);
    printf("client closed\n");
    close(lfd);
}
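
The exported function name `Java_WeServerSocket_conn` is not arbitrary: the JVM looks up a native method by a symbol built from the prefix `Java_`, the fully qualified class name, an underscore, and the method name. The sketch below builds that symbol for simple cases. The helper `jni_symbol_name` is hypothetical, and it is deliberately incomplete: it handles only package separators and the `_` to `_1` escape, and leaves out the other escapes (non-ASCII characters, `;`, `[`) and the argument-signature suffix used for overloaded native methods.

```c
#include <stddef.h>
#include <string.h>

/* Append piece to out, tracking the used length. Returns -1 if it does not fit. */
static int append_piece(char *out, size_t out_size, size_t *used, const char *piece) {
    size_t len = strlen(piece);
    if (*used + len >= out_size) { /* keep room for the terminating '\0' */
        return -1;
    }
    memcpy(out + *used, piece, len + 1);
    *used += len;
    return 0;
}

/* Hypothetical helper: build the JNI symbol for a non-overloaded native method.
 * class_name is the fully qualified name, e.g. "WeServerSocket" or "a.b.C".
 * Returns 0 on success, -1 if out is too small. */
int jni_symbol_name(const char *class_name, const char *method,
                    char *out, size_t out_size) {
    const char *parts[2];
    size_t used = 0;
    int i;

    parts[0] = class_name;
    parts[1] = method;
    if (append_piece(out, out_size, &used, "Java_") == -1) return -1;
    for (i = 0; i < 2; i++) {
        const char *p;
        if (i == 1 && append_piece(out, out_size, &used, "_") == -1) return -1;
        for (p = parts[i]; *p != '\0'; p++) {
            char single[2];
            const char *piece = single;
            single[0] = *p;
            single[1] = '\0';
            if (*p == '.' || *p == '/') piece = "_";  /* package separator */
            else if (*p == '_') piece = "_1";         /* escaped underscore */
            if (append_piece(out, out_size, &used, piece) == -1) return -1;
        }
    }
    return 0;
}
```

With class `WeServerSocket` in the default package and method `conn`, the result is `Java_WeServerSocket_conn`, which matches the function defined in the C file.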

JNI call steps

1. Call a native method from Java

public class WeServerSocket {
    static {
        System.loadLibrary("WeNativeNet"); // resolves to libWeNativeNet.so on Linux
    }

    public static void main(String[] args) {
        WeServerSocket serverSocket = new WeServerSocket();
        serverSocket.conn(8080);
    }

    public native void conn(int port); // implemented in the C file
}

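2. Load the library. Putting the call in a static initializer block ensures the JVM loads the required library (the .so file) when the class is initialized, before any native method can be called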
```
static {
 System.loadLibrary("WeNativeNet");
}
```
3. Compile to a class file
```
javac WeServerSocket.java
```
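4. Generate the .h header file (javah ships with the JDK 8 used here; it was removed in JDK 10, where javac -h does the same job)
```
# general form: javah package.ClassName
javah WeServerSocket
```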
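5. Include the .h header in the C file, then look up the method signature in the header and copy it into the C file
```c
#include "WeServerSocket.h"

JNIEXPORT void JNICALL Java_WeServerSocket_conn(JNIEnv *env, jobject cl, jint port)
```
6. Compile the C file into a shared library (.so file)
- The /usr/lib/jdk/jdk1.8.0_231/include directory contains jni.h
- The /usr/lib/jdk/jdk1.8.0_231/include/linux directory contains jni_md.h
- The library file name must start with lib; to match System.loadLibrary("WeNativeNet"), the output for this example must be named libWeNativeNet.so
```bash
gcc -fPIC -I /usr/lib/jdk/jdk1.8.0_231/include -I /usr/lib/jdk/jdk1.8.0_231/include/linux -shared -o libYOUR_LIBRARY_NAME.so YOUR_C_FILE.c -pthread
```
7. Add the directory that contains the shared library to the LD_LIBRARY_PATH environment variable (this lasts only for the current shell session); to make it permanent, add the line to /etc/profile
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/
```
8. Run the class file
```bash
java WeServerSocket

-listen client @port=8080...
```
9. Test with nc
```bash
nc 127.0.0.1 8080
hello server
```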
This completes the full flow of mimicking ServerSocket to start a socket server, and along the way you have seen how native methods and JNI calls work.