Fixing lib errors, broken sudo, Docker connectivity failures, and other problems after accidentally adding a Debian package source to Ubuntu and upgrading packages

Problem 1:

Running

sudo apt-get update && sudo apt-get upgrade

produces

Errors were encountered while processing:
 /var/cache/apt/archives/libgstreamer-plugins-bad1.0-0_1.18.3-1+b1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

apt then suggests running

sudo apt --fix-broken install

but that does not help.

Solution:

Reference: Apt-get upgrade blocked by gstreamer1.0-plugins-bad - PureOS - Purism community

First, remove (or comment out) the Debian source in /etc/apt/sources.list:

# deb http://ftp.de.debian.org/debian sid main

Then run

sudo apt clean && sudo apt update && sudo apt full-upgrade

Reference: https://blog.csdn.net/xiaosun26/article/details/8019326

Force-remove the broken package:

sudo dpkg -P --force-all libgstreamer-plugins-bad1.0-0
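Before running the upgrade again, it may help to confirm that no Debian entries remain in any apt source file. The following is a minimal sketch, assuming Debian mirrors can be recognized by `debian.org` in the URL; the function name `find_debian_sources` and the demo file are hypothetical.

```shell
# List active (uncommented) apt source lines that point at a Debian mirror.
# Usage: find_debian_sources FILE...
# Assumption: Debian mirrors are identified by "debian.org" in the URL.
find_debian_sources() {
    # -H prints the file name, -n the line number; commented lines do not match.
    grep -HnE '^[[:space:]]*deb(-src)?[[:space:]].*debian\.org' "$@" || true
}

# Demo on a throwaway file rather than the real /etc/apt/sources.list.
demo_list=$(mktemp)
printf '%s\n' \
    'deb http://archive.ubuntu.com/ubuntu focal main' \
    'deb http://ftp.de.debian.org/debian sid main' \
    '# deb http://ftp.de.debian.org/debian sid main' > "$demo_list"
find_debian_sources "$demo_list"
```

On a real system the same function could be pointed at /etc/apt/sources.list and the files under /etc/apt/sources.list.d/.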

Problem 2:

Running

sudo add-apt-repository ppa:minos-archive/main

produces

AttributeError: 'Thread' object has no attribute 'isAlive'

Solution:

According to AttributeError: 'Thread' object has no attribute 'isAlive' · Issue #51 · jupyter-vim/jupyter-vim

Python 3.9 removed the deprecated Thread.isAlive() method, leaving only is_alive(). The Python 3.9 on this machine presumably came from the Debian source, so removing it and keeping Python 3.8 is enough:

Reference: 16.04 - Uninstall python 3.6 - Ask Ubuntu

sudo apt-get remove --purge python3.9

Problem 3:

Running

sudo add-apt-repository ppa:minos-archive/main

produces

sudo: add-apt-repository: command not found

Solution:

Reference: How to fix sudo: add-apt-repository: command not found error on Linux Ubuntu – Linux Hint

sudo apt install software-properties-common

Problem 4:

Running

sudo apt install software-properties-common

produces

The following packages have unmet dependencies:
 python3 : Depends: libpython3-stdlib (= 3.8.2-0ubuntu2) but 3.9.2-3 is to be installed
E: Unable to correct problems, you have held broken packages.

Solution:

Run

sudo apt list | grep libpython3-stdlib

which gives

libpython3-stdlib/now 3.9.2-3 amd64 [installed,local]
libpython3-stdlib/focal 3.8.2-0ubuntu2 i386

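The first entry is the installed version. It is probably the newer build from Debian rather than the Ubuntu one, so it has to be replaced with the focal version.

Reference: How to Install Specific Version of Package using apt-get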

sudo apt install libpython3-stdlib=3.8.2-0ubuntu2
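The same `[installed,local]` marker shows up again later for other packages, so it can be useful to list all such packages at once. The sketch below assumes that `local` in `apt list` output means the installed version is not offered by any configured source; the function name `list_local_packages` is hypothetical, and the sample lines are copied from the output above.

```shell
# Print "package version" for every apt list entry marked [installed,local].
# Reads apt-list-style lines on stdin.
list_local_packages() {
    awk '/\[installed,local\]/ { split($1, a, "/"); print a[1], $2 }'
}

# Demo on sample lines; on a real system one could pipe in
# the output of: apt list --installed 2>/dev/null
printf '%s\n' \
    'libpython3-stdlib/now 3.9.2-3 amd64 [installed,local]' \
    'libpython3-stdlib/focal 3.8.2-0ubuntu2 i386' \
    'python-apt-common/now 2.1.7 all [installed,local]' | list_local_packages
```

Each reported package is a candidate for reinstalling with an explicit `package=version` taken from the focal entries.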

Problem 5:

Running

sudo add-apt-repository ppa:minos-archive/main

produces

Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 108, in <module>
    sp = SoftwareProperties(options=options)
  File "/usr/lib/python3/dist-packages/softwareproperties/SoftwareProperties.py", line 118, in __init__
    self.reload_sourceslist()
  File "/usr/lib/python3/dist-packages/softwareproperties/SoftwareProperties.py", line 613, in reload_sourceslist
    self.distro.get_sources(self.sourceslist)    
  File "/usr/lib/python3/dist-packages/aptsources/distro.py", line 91, in get_sources
    raise NoDistroTemplateException(
aptsources.distro.NoDistroTemplateException: Error: could not find a distribution template for Debian/bullseye

Reference: Server running Ubuntu is suddenly Debian bullseye?

Checking the distribution information shows:

lsb_release -a
LSB Version:	security-11.1.0ubuntu2-noarch
Distributor ID:	Debian
Description:	Debian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye

/etc/os-release contains:

vi /etc/os-release
PRETTY_NAME="Debian GNU/Linux bullseye/sid"
NAME="Debian GNU/Linux"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

The OS name has been changed; in theory, changing it back should be enough.

Method 1:

Reference: How to revert damage done by editing /etc/os-release - Ask Ubuntu

sudo vi /etc/os-release

Change the contents to:

NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

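In practice, however, you should not edit /etc/os-release by hand as in Method 1; reinstall base-files instead.

Method 2:

Reference: Apt "could not find a distribution template" error - Ask Ubuntu

Download the base-files_*.deb matching your release from https://launchpad.net/ubuntu/+source/base-files

and install it: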

sudo dpkg -i /your/path/to/base-files_*.deb

Then run the following to confirm that the package reinstalls cleanly:

sudo apt-get install --reinstall base-files
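After base-files is back, one way to verify the result is to read the identification fields from the os-release file. This is a small sketch; the function name `os_release_id` and the demo file are hypothetical, and the file is sourced in a subshell so its variables do not leak into the current shell.

```shell
# Print the ID and VERSION_CODENAME fields of an os-release style file.
os_release_id() {
    (
        # Clear inherited values so only the file's contents are reported.
        unset ID VERSION_CODENAME
        . "$1" && printf '%s %s\n' "${ID:-unknown}" "${VERSION_CODENAME:-unknown}"
    )
}

# Demo on a throwaway file; for the real system use: os_release_id /etc/os-release
demo_os=$(mktemp)
printf '%s\n' 'NAME="Ubuntu"' 'ID=ubuntu' 'VERSION_CODENAME=focal' > "$demo_os"
os_release_id "$demo_os"
```

On a repaired Ubuntu 20.04 system the expectation would be `ubuntu focal`; the broken state shown earlier has `ID=debian` and no codename field.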

After applying both Method 1 and Method 2, running

sudo add-apt-repository ppa:minos-archive/main

still fails with

Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 108, in <module>
    sp = SoftwareProperties(options=options)
  File "/usr/lib/python3/dist-packages/softwareproperties/SoftwareProperties.py", line 118, in __init__
    self.reload_sourceslist()
  File "/usr/lib/python3/dist-packages/softwareproperties/SoftwareProperties.py", line 613, in reload_sourceslist
    self.distro.get_sources(self.sourceslist)    
  File "/usr/lib/python3/dist-packages/aptsources/distro.py", line 91, in get_sources
    raise NoDistroTemplateException(
aptsources.distro.NoDistroTemplateException: Error: could not find a distribution template for Ubuntu/focal

Solution:

It turned out that the Debian version of the python-apt-common package was installed:

sudo apt list | grep python-apt-common
python-apt-common/now 2.1.7 all [installed,local]

References:
使用add-apt-repository时提示,could not find a distribution template for Deepin/stable
Debian Bug report logs - #513039 aborts with "could not find a distribution template"
apt - Problem with software-properties-gtk in Ubuntu 18.04 LTS - Ask Ubuntu
package management - Should I use "apt-get remove" or "apt-get purge"? - Unix & Linux Stack Exchange

This package installs a number of template files, such as Ubuntu.info, into /usr/share/python-apt/templates, and the PPA paths are resolved from these templates.

First remove python-apt-common:

sudo apt purge python-apt-common

After removal, /usr/share/python-apt/templates is empty, and running

sudo add-apt-repository ppa:minos-archive/main

produces

sudo: add-apt-repository: command not found

which means add-apt-repository itself was removed as well.

Reference: 20.04 - add-apt-repository  always throws an error - Ask Ubuntu

Next, reinstall software-properties-common:

sudo apt install --reinstall software-properties-common

As Problem 3 showed, this should bring add-apt-repository back.

Check:

sudo apt list | grep python-apt-common
python-apt-common/focal-updates,focal-updates,now 2.0.0ubuntu0.20.04.4 all [installed,automatic]

python-apt-common is now the Ubuntu version.

Running

sudo add-apt-repository ppa:minos-archive/main

again no longer produces an error :)

Problem 6:

Running

sudo apt update

produces

Ign:14 http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute InRelease
Err:15 http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute Release
  404  Not Found [IP: 91.189.95.85 80]
Reading package lists... Done
E: The repository 'http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

Solution:

Disable the invalid source:

sudo vi /etc/apt/sources.list.d/minos-archive-ubuntu-main-hirsute.list

Comment out its contents:

# deb http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute main
# deb-src http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute main
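The same edit can be scripted. The sketch below comments out every active deb or deb-src line in a source file that names a given suite; the function name `disable_suite` and the demo file are hypothetical, and the original file is kept as a `.bak` copy.

```shell
# Comment out every active deb/deb-src line in FILE that mentions SUITE.
# Usage: disable_suite FILE SUITE   (edits FILE in place, keeps FILE.bak)
disable_suite() {
    sed -i.bak -E "/^[[:space:]]*deb(-src)?[[:space:]].*[[:space:]]$2[[:space:]]/ s/^/# /" "$1"
}

# Demo on a throwaway file instead of a real file under /etc/apt/sources.list.d/.
demo_ppa=$(mktemp)
printf '%s\n' \
    'deb http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute main' \
    '# deb-src http://ppa.launchpad.net/minos-archive/main/ubuntu hirsute main' \
    'deb http://archive.ubuntu.com/ubuntu focal main' > "$demo_ppa"
disable_suite "$demo_ppa" hirsute
cat "$demo_ppa"
```

Lines that are already commented out or that name a different suite are left untouched. Running it against a real file under /etc/apt would additionally need root privileges.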

Problem 7:

Inspecting /var/log/auth.log

vi /var/log/auth.log

shows

PAM unable to dlopen(pam_captcha.so): /lib/x86_64-linux-gnu/security/pam_captcha.so: cannot open shared object file: No such file or directory
Apr 15 23:02:20 server sshd[136014]: PAM adding faulty module: pam_captcha.so

Perhaps libpam-captcha was not installed correctly. According to https://askubuntu.com/questions/1125440/finding-pam-module

PAM modules should be installed in the security subdirectory of one of the following directories:

/lib/i386-linux-gnu
/lib/x86_64-linux-gnu
/usr/lib/i386-linux-gnu
/usr/lib/x86_64-linux-gnu

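Later I found that pam_captcha.so had been installed in /lib/security, which seems to be the package's default behavior. That path should be searchable, yet the module was not found, so I suspected the ssh version.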

sudo apt list | grep openssh
openssh-client-ssh1/focal 1:7.5p1-11build1 amd64
openssh-client/now 1:8.4p1-5 amd64 [installed,local]
openssh-client/focal-updates,focal-security 1:8.2p1-4ubuntu0.2 i386

...
openssh-server/now 1:8.4p1-5 amd64 [installed,local]
openssh-server/focal-updates,focal-security 1:8.2p1-4ubuntu0.2 i386

Both the openssh client and server are the Debian versions, so reinstall the Ubuntu ones:

sudo apt install openssh-client=1:8.2p1-4ubuntu0.2
sudo apt install openssh-server=1:8.2p1-4ubuntu0.2

The module still cannot be found...

It could also be a PAM version problem:

sudo apt list | grep pam
libpam-modules-bin/now 1.4.0-7 amd64 [installed,local]
libpam-modules-bin/focal-updates 1.3.1-5ubuntu4.1 i386
libpam-modules/now 1.4.0-7 amd64 [installed,local]
libpam-modules/focal-updates 1.3.1-5ubuntu4.1 i386

But reinstalling it triggers a dangerous warning:

sudo apt install libpam-modules-bin=1.3.1-5ubuntu4.1
WARNING: The following essential packages will be removed.
This should NOT be done unless you know exactly what you are doing!
  apt adduser (due to apt) init systemd-sysv (due to init) login libpam-runtime (due to login) libpam-modules (due to login) mount
  util-linux (due to mount)

Solution:

Give the module's full path in /etc/pam.d/sshd. References: ubuntu - why is my pam_userdb.so missing? - Stack Overflow, https://manpages.ubuntu.com/manpages/bionic/man5/pam.conf.5.html

sudo vi /etc/pam.d/sshd
#Autogenerated by libpam-captcha
auth requisite /lib/security/pam_captcha.so math randomstring
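To find which directory actually holds a module before hard-coding its path, a small search helper can be used. This is a sketch; the function name `find_pam_module` is hypothetical, and the directory list in the demo is taken from the candidates mentioned above plus /lib/security.

```shell
# Print the first path where MODULE exists among the given directories.
# Usage: find_pam_module MODULE DIR...
find_pam_module() {
    module=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$module" ]; then
            printf '%s\n' "$dir/$module"
            return 0
        fi
    done
    return 1
}

find_pam_module pam_captcha.so \
    /lib/x86_64-linux-gnu/security \
    /usr/lib/x86_64-linux-gnu/security \
    /lib/security \
    || echo "pam_captcha.so not found in the listed directories"
```

The printed path is what would go into the module field of the /etc/pam.d/sshd line.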

Problem 8:

Running

sudo service ssh restart

produces

Failed to allocate directory watch: Too many open files

Solution:

Reference: Failed to allocate directory watch: Too many open files

Open /etc/sysctl.conf and add the following settings:

vi /etc/sysctl.conf
# sysctl fs.inotify
# 2^10 = 1024
fs.inotify.max_user_instances=1024 
# 2^18 = 262144
fs.inotify.max_user_watches=262144 

Then run

sudo sysctl -p

Analysis:

The inotify limits were probably exhausted; after raising them the service starts normally.

To view the inotify parameters:

sudo sysctl fs.inotify
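As an alternative to editing /etc/sysctl.conf directly, the two values could be kept in a separate drop-in file. The sketch below only generates the fragment, computing the limits as powers of two; the function name `inotify_conf` and the file name in the comment are hypothetical.

```shell
# Emit a sysctl fragment with the inotify limits used above.
inotify_conf() {
    # 2^10 = 1024 instances, 2^18 = 262144 watches
    printf 'fs.inotify.max_user_instances=%d\n' "$((1 << 10))"
    printf 'fs.inotify.max_user_watches=%d\n' "$((1 << 18))"
}

inotify_conf
# To persist (requires root; the file name is an assumption):
#   inotify_conf | sudo tee /etc/sysctl.d/90-inotify.conf
#   sudo sysctl --system
```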

Problem 9:

I wanted to stop the Docker daemon directly, but systemd warned that it could still be reactivated:

sudo systemctl stop docker.service
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket

Solution:

Reference: Starting Docker Daemon on Demand with Socket Activation - blog.dbrgn.ch

Stop docker.socket first, then docker.service:

sudo systemctl stop docker.socket 
sudo systemctl stop docker.service

Check the status:

sudo systemctl status docker.service

Problem 10:

Docker containers cannot reach the outside network and show various error messages.

For example, running

docker run --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

produces the error

ping: Lacking privilege for raw socket.

Adding --privileged makes it work:

docker run --rm --privileged -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

Another example, following Docker Port Binding explained | Better Programming:

Running

docker container run -p 8080:80 -d nginx

exits immediately, and docker ps shows no running containers.

With --privileged added,

docker container run --privileged -p 8080:80 -d nginx

the container shows up in docker ps, and the following request succeeds:

curl -I 0.0.0.0:8080
HTTP/1.1 200 OK
Server: nginx/1.19.10
Date: Sat, 17 Apr 2021 12:48:19 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 13 Apr 2021 15:13:59 GMT
Connection: keep-alive
ETag: "6075b537-264"
Accept-Ranges: bytes

In addition,

docker run --rm --dns 8.8.8.8 --dns 8.8.4.4 alpine ifconfig

gives

ifconfig: socket(AF_INET,2,0): Permission denied

Attempt: disable AppArmor

According to https://github.com/rancher/rke/issues/1722
https://linuxconfig.org/how-to-disable-apparmor-on-ubuntu-20-04-focal-fossa-linux

Check the status:

sudo apparmor_status

Disable AppArmor:

sudo systemctl disable apparmor

Then reboot:

reboot

Still not working...

Attempt: flush iptables

Reference: https://stackoverflow.com/questions/20430371/my-docker-container-has-no-internet

https://github.com/moby/moby/issues/866#issuecomment-19218300

# Install brctl
sudo apt-get install bridge-utils
sudo systemctl stop docker.socket
sudo systemctl stop docker.service
sudo pkill docker
sudo iptables -F
sudo ifconfig docker0 down
sudo brctl delbr docker0
sudo systemctl start docker.service

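Either brctl delbr docker0 or ip link del docker0 deletes the Docker bridge.

Use ifconfig to check whether the Docker bridge exists.

Still not working...

Attempt: install the snap version

First purge the existing packages: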

sudo apt-get purge docker-ce docker-ce-cli containerd.io

Install the snap version:

snap install docker

Run

docker ps

which prints

WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/json: dial unix /var/run/docker.sock: socket: permission denied

Still not working...

Attempt: reconfigure all packages

According to Ubuntu – Dpkg-reconfigure and how is it different from dpkg –configure – iTecTec

From man dpkg

   --configure package...|-a|--pending
          Configure a package which has been unpacked but not yet  config‐
          ured.   If  -a  or  --pending  is  given instead of package, all
          unpacked but unconfigured packages are configured.

          Configuring consists of the following steps:

          1.  Unpack  the  conffiles, and at the same time back up the old
          conffiles, so that they can be restored if something goes wrong.

          2. Run postinst script, if provided by the package.

From man dpkg-reconfigure

   dpkg-reconfigure - reconfigure an already installed package

   -pvalue, --priority=value
       Specify the minimum priority of question that will be displayed.
       dpkg-reconfigure normally shows low priority questions no matter
       what your default priority is. See debconf(7) for a list.

   -a, --all
       Reconfigure all installed packages that use debconf. Warning: this
       may take a long time.

Here dpkg --configure -a will configure all unpacked but unconfigured packages. whereas dpkg-reconfigure -phigh -a will reconfigure all installed packages that use debconf with high priority.

I originally planned to run

sudo dpkg-reconfigure -phigh -a

to reconfigure all packages,

but it failed with

Unknown option: a

According to Bug #1463672 “-a and -all swith for dpkg-reconfigure do not work...” : Bugs : debconf package : Ubuntu

this option has been removed, so the fallback is to list every installed package and reconfigure them one by one:

for i in $(dpkg --get-selections | awk '$2 == "install" {print $1}'); do sudo dpkg-reconfigure --priority=high "$i"; done
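When filtering `dpkg --get-selections`, a plain substring match on `install` would also match entries in the `deinstall` state and package names that contain the word. A stricter filter compares the state column exactly; the following is a sketch on made-up sample lines, with the hypothetical function name `selected_packages`.

```shell
# Keep only packages whose selection state is exactly "install".
# Reads "package<TAB>state" lines, as printed by dpkg --get-selections.
selected_packages() {
    awk '$2 == "install" { print $1 }'
}

# Sample input (package names are illustrative only).
printf '%s\t\t%s\n' \
    apparmor install \
    old-pkg deinstall \
    install-info install | selected_packages
```

Only `apparmor` and `install-info` pass the filter; `old-pkg` is skipped because its state is `deinstall`.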

Still not working...

Attempt: change permissions

Install podman for comparison:

. /etc/os-release
echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_${VERSION_ID}/ /" | sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
curl -L "https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_${VERSION_ID}/Release.key" | sudo apt-key add -
sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get -y install podman

Run

podman run --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

The result is the same:

ping: Lacking privilege for raw socket.

Try adding --privileged:

podman run --privileged --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

Result:

PING google.com (172.217.24.14): 56 data bytes
^C--- google.com ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

Now no replies are received at all?

While puzzling over this, I tried running it with sudo:

sudo podman run --privileged --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

and surprisingly it worked:

PING google.com (172.217.160.78): 56 data bytes
64 bytes from 172.217.160.78: icmp_seq=0 ttl=118 time=3.335 ms
64 bytes from 172.217.160.78: icmp_seq=1 ttl=118 time=3.416 ms
64 bytes from 172.217.160.78: icmp_seq=2 ttl=118 time=3.305 ms
64 bytes from 172.217.160.78: icmp_seq=3 ttl=118 time=3.246 ms
64 bytes from 172.217.160.78: icmp_seq=4 ttl=118 time=3.297 ms
--- google.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 3.246/3.320/3.416/0.056 ms

So I looked into podman.

According to non-root podman not able to ping external services (limited net capability) · Issue #2488 · containers/podman

https://github.com/containers/podman/blob/main/troubleshooting.md#5-rootless-containers-cannot-ping-hosts

because podman runs rootless containers when invoked without sudo:

It is most likely necessary to enable unprivileged pings on the host. Be sure the UID of the user is part of the range in the /proc/sys/net/ipv4/ping_group_range file.

To change its value you can use something like: sysctl -w "net.ipv4.ping_group_range=0 2000000".

To make the change persistent, you'll need to add a file in /etc/sysctl.d that contains net.ipv4.ping_group_range=0 $MAX_UID.

Run

sudo sysctl -w "net.ipv4.ping_group_range=0 2000000"
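To see whether the current user is already covered before changing the setting, the range can be compared with the user's group ID (the kernel setting is a range of group IDs, with a lower bound above the upper bound meaning that nobody is allowed). This is a minimal sketch; the function name `gid_in_ping_range` is hypothetical.

```shell
# Succeed if GID lies within the "LOW HIGH" range string.
# Usage: gid_in_ping_range GID "LOW HIGH"
gid_in_ping_range() {
    gid=$1
    set -- $2    # split "LOW HIGH" on whitespace into $1 and $2
    [ "$gid" -ge "$1" ] && [ "$gid" -le "$2" ]
}

# A range of "1 0" is empty, so no group is allowed.
gid_in_ping_range 1000 "1 0" || echo "gid 1000: unprivileged ping disabled"
# The range set by the sysctl command above covers ordinary users.
gid_in_ping_range 1000 "0 2000000" && echo "gid 1000: unprivileged ping allowed"
```

On a real host the arguments would be `"$(id -g)"` and `"$(cat /proc/sys/net/ipv4/ping_group_range)"`.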

then run

podman run --privileged --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

and it works:

PING google.com (142.251.43.14): 56 data bytes
64 bytes from 142.251.43.14: icmp_seq=0 ttl=255 time=3.454 ms
64 bytes from 142.251.43.14: icmp_seq=1 ttl=255 time=3.593 ms
64 bytes from 142.251.43.14: icmp_seq=2 ttl=255 time=3.467 ms
64 bytes from 142.251.43.14: icmp_seq=3 ttl=255 time=3.379 ms
64 bytes from 142.251.43.14: icmp_seq=4 ttl=255 time=3.589 ms
--- google.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 3.379/3.496/3.593/0.083 ms

This also works:

podman run --cap-add=NET_RAW --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com

But with sudo it fails?

sudo podman run --cap-add=NET_RAW --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com
ping: Lacking privilege for raw socket.

So I suspected sudo itself.

Attempt: reinstall sudo

I found setuid root executables lose sticky bit · Issue #802 · concourse/concourse

which says:

Running stat $(which sudo) after hijacking in shows 0777 permissions. Manually running chmod u+s $(which sudo) in our task script fixed the issue.

Run

chmod u+s $(which sudo)

Still not working...

Running apt list | grep sudo gives:

sudo/now 1.9.5p2-3 amd64 [installed,local]
sudo/focal-updates,focal-security 1.8.31-1ubuntu1.2 i386

This sudo is clearly the Debian version.

Switch back to the Ubuntu version:

sudo apt install sudo=1.8.31-1ubuntu1.2

Still not working...

Attempt: reinstall AppArmor

Later I consulted Host [x.x.x.x] is not able to connect to the following ports: [nc: socket(AF_INET,1,0): Permission denied · Issue #1722 · rancher/rke

First check AppArmor's status:

sudo aa-status
apparmor module is loaded.
41 profiles are loaded.
40 profiles are in enforce mode.
   /snap/snapd/12704/usr/lib/snapd/snap-confine
   /snap/snapd/12704/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
   /snap/snapd/12883/usr/lib/snapd/snap-confine
   /snap/snapd/12883/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
   /usr/bin/evince
   /usr/bin/evince-previewer
   /usr/bin/evince-previewer//sanitized_helper
   /usr/bin/evince-thumbnailer
   /usr/bin/evince//sanitized_helper
   /usr/bin/man
   /usr/lib/NetworkManager/nm-dhcp-client.action
   /usr/lib/NetworkManager/nm-dhcp-helper
   /usr/lib/connman/scripts/dhclient-script
   /usr/lib/cups/backend/cups-pdf
   /usr/lib/snapd/snap-confine
   /usr/lib/snapd/snap-confine//mount-namespace-capture-helper
   /usr/sbin/cups-browsed
   /usr/sbin/cupsd
   /usr/sbin/cupsd//third_party
   /{,usr/}sbin/dhclient
   containers-default-0.38.16
   cri-containerd.apparmor.d
   docker-default
   ippusbxd
   lsb_release
   man_filter
   man_groff
   nvidia_modprobe
   nvidia_modprobe//kmod
   snap-update-ns.chromium
   snap-update-ns.helm
   snap-update-ns.snap-store
   snap.chromium.chromedriver
   snap.chromium.chromium
   snap.chromium.hook.configure
   snap.snap-store.hook.configure
   snap.snap-store.snap-store
   snap.snap-store.ubuntu-software
   snap.snap-store.ubuntu-software-local-file
   tcpdump
1 profiles are in complain mode.
   snap.helm.helm
13 processes have profiles defined.
13 processes are in enforce mode.
   /usr/sbin/cups-browsed (908) 
   /usr/sbin/cupsd (881) 
   /usr/sbin/lighttpd (1994) cri-containerd.apparmor.d
   /metrics-server (2053) cri-containerd.apparmor.d
   /usr/bin/local-path-provisioner (2434) cri-containerd.apparmor.d
   /coredns (2435) cri-containerd.apparmor.d
   /traefik (2719) cri-containerd.apparmor.d
   /usr/bin/busybox (2849) cri-containerd.apparmor.d
   /usr/bin/busybox (2893) cri-containerd.apparmor.d
   /usr/sbin/mysqld (2936) cri-containerd.apparmor.d
   /app/cmd/webhook/webhook.runfiles/com_github_jetstack_cert_manager/cmd/webhook/webhook_/webhook (3460) cri-containerd.apparmor.d
   /app/cmd/cainjector/cainjector.runfiles/com_github_jetstack_cert_manager/cmd/cainjector/cainjector_/cainjector (3461) cri-containerd.apparmor.d
   /app/cmd/controller/controller.runfiles/com_github_jetstack_cert_manager/cmd/controller/controller_/controller (3462) cri-containerd.apparmor.d
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

Running apt list | grep apparmor gives:

apparmor/now 2.13.3-7 amd64 [installed,local]
apparmor/focal-updates,now 2.13.3-7ubuntu5.1 amd64

The installed apparmor is the Debian version.

Switch back to the Ubuntu version:

sudo apt install apparmor=2.13.3-7ubuntu5.1

Then reboot:

sudo reboot

And it works:

docker run --cap-add=NET_RAW --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com
PING google.com (172.217.160.110): 56 data bytes
64 bytes from 172.217.160.110: icmp_seq=0 ttl=116 time=3.514 ms
64 bytes from 172.217.160.110: icmp_seq=1 ttl=116 time=3.203 ms
64 bytes from 172.217.160.110: icmp_seq=2 ttl=116 time=3.586 ms
64 bytes from 172.217.160.110: icmp_seq=3 ttl=116 time=3.203 ms
64 bytes from 172.217.160.110: icmp_seq=4 ttl=116 time=3.125 ms
--- google.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 3.125/3.326/3.586/0.186 ms

However, podman still requires running sudo sysctl -w "net.ipv4.ping_group_range=0 2000000" first; otherwise this happens:

podman run --cap-add=NET_RAW --dns 8.8.8.8 --rm -it adiazmor/docker-ubuntu-with-ping ping -c 5 google.com
PING google.com (142.251.42.238): 56 data bytes
--- google.com ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

ping needs the NET_RAW capability, but ordinary network applications do not, so wget is a better connectivity test:

docker run -it --rm alpine wget http://github.com/containers/libpod
Connecting to github.com (13.250.177.223:80)
Connecting to github.com (13.250.177.223:443)
Connecting to github.com (13.250.177.223:443)
saving to 'libpod'
libpod               100% |************************|  281k  0:00:00 ETA
'libpod' saved

podman run -it --dns 8.8.8.8 --rm alpine wget http://github.com/containers/libpod
Connecting to github.com (52.192.72.89:80)
Connecting to github.com (52.192.72.89:443)
Connecting to github.com (52.69.186.44:443)
saving to 'libpod'
libpod               100% |************************|  277k  0:00:00 ETA
'libpod' saved

Query DNS with nslookup:

docker run --rm -it tutum/dnsutils nslookup google.com
Server:		140.114.63.1
Address:	140.114.63.1#53

Non-authoritative answer:
Name:	google.com
Address: 172.217.24.14

podman run --rm -it tutum/dnsutils nslookup google.com
Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	google.com
Address: 172.217.24.14

Print the capabilities:

docker run -it amouat/caps capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=

podman run -it amouat/caps capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=

sudo podman run -it amouat/caps capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=
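The capability lists above are long enough that the difference is easy to miss. The sketch below compares the Bounding set printed for docker with the one printed for podman and lists what podman lacks; the function name `caps_missing` is hypothetical, and the two lists are copied from the output above.

```shell
# Print capabilities present in the first comma-separated list but not in the second.
# Usage: caps_missing LIST_A LIST_B
caps_missing() {
    a=$(mktemp)
    b=$(mktemp)
    printf '%s\n' "$1" | tr ',' '\n' | sort > "$a"
    printf '%s\n' "$2" | tr ',' '\n' | sort > "$b"
    comm -23 "$a" "$b"    # lines that appear only in the first file
    rm -f "$a" "$b"
}

docker_caps=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
podman_caps=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap

caps_missing "$docker_caps" "$podman_caps"
```

The result is cap_audit_write, cap_mknod, and cap_net_raw, which is consistent with podman needing --cap-add=NET_RAW for ping while docker does not.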


A small accident:

I accidentally changed the permissions of the sudo binary to 777, after which every sudo <any command> failed with

sudo: /usr/bin/sudo must be owned by uid 0 and have the setuid bit set

Reference: linux下sudo顯示sudo: /usr/bin/sudo must be owned by uid 0 and have the setuid bit set - IT閱讀

To restore the setuid bit, I ran

chmod u+s $(which sudo)

but got this error:

chmod: changing permissions of '/usr/bin/sudo': Operation not permitted

Then I thought of downloading the sudo binary, fixing its permissions, and running that copy. To find the sudo version, I ran apt list | grep sudo:

sudo/now 1.9.5p2-3 amd64 [installed,local]
sudo/focal-updates,focal-security 1.8.31-1ubuntu1.2 i386

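This sudo is clearly the Debian version.

I found it in the official Debian package archive: https://packages.debian.org/bullseye/sudo

I downloaded the .deb, extracted the sudo binary, and uploaded it over ssh. But the ssh user is not root, so the uploaded file was owned by a regular user; I could set the setuid bit but could not chown root:root, so this also failed.

At that point the only option left seemed to be going on site, booting into recovery mode, entering a single-user root shell, remounting the filesystem read/write, and restoring the setuid bit on /usr/bin/sudo: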

ubuntu - Changing ownership of ‘/usr/bin/’: Operation not permitted - Stack Overflow

If you can't gain root with plain "su" because you don't know the password or none has been set, then you have to reboot into a root shell. When you see the GRUB boot menu, press "e" to edit the kernel command lines, and append "init=/bin/sh" - then it will dump you into a single-user root shell instead of the normal boot process. Here you may have to remount the root file system read/write:

# mount / -n -w -o remount

Then you need to undo the damage from earlier:

# chown -R root /usr/bin

Then finally remount the file system read-only, sync and reboot:

# mount / -n -r -o remount
# sync
# reboot -f

root - /usr/bin/sudo must be owned by uid 0 and have the setuid bit set - Ask Ubuntu

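Then I had a sudden idea: could I fix it through a Docker volume mount? The Docker daemon already running on the machine has root privileges, so starting any container with a root user and bind-mounting /usr/bin into it lets me make the change from inside.

Run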

docker run --rm --privileged -v /usr/bin:/tmp/usr/bin -it adiazmor/docker-ubuntu-with-ping bash
root@24a13c2a81f8:/# cd /tmp/usr/bin/
root@24a13c2a81f8:/tmp/usr/bin# chmod u+s ./sudo

After exiting the container, run sudo --version again, and ta-da!

Sudo version 1.9.5p2
Sudoers policy plugin version 1.9.5p2
Sudoers file grammar version 48
Sudoers I/O plugin version 1.9.5p2
Sudoers audit plugin version 1.9.5p2

It works again!
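The two conditions named in the error message (owner uid 0 and the setuid bit) can be checked without invoking sudo at all. This is a sketch that assumes GNU stat, as shipped with Ubuntu; the function name `has_setuid_owner` is hypothetical.

```shell
# Succeed if FILE has the setuid bit set and is owned by UID (default 0).
# Usage: has_setuid_owner FILE [UID]
has_setuid_owner() {
    file=$1
    want_uid=${2:-0}
    # [ -u ] tests the setuid bit; stat -c '%u' prints the numeric owner (GNU stat).
    [ -u "$file" ] && [ "$(stat -c '%u' "$file")" = "$want_uid" ]
}

if has_setuid_owner /usr/bin/sudo; then
    echo "/usr/bin/sudo: owner and setuid bit look correct"
else
    echo "/usr/bin/sudo: wrong owner or missing setuid bit"
fi
```

Note that chmod u+s on a file that was set to 777 leaves it world-writable; the usual mode for sudo is 4755, so a follow-up chmod 4755 from the same root container would be worth considering.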