Hi all,
When I try to upload a file (in this case a .pdf), I sometimes receive the error
HTTP/1.1 307 Temporary Redirect
I tried to upload a .png file and an .mp4 file without errors.
Then I tried with another .pdf file and everything worked.
What can I do to avoid this error?
TL;DR
The files to be uploaded and the sha1 sums
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ ls -lh
total 25M
-rw-r--r-- 1 sysop sysop 7.3M Mar 27 2024 11537353-hd_1920_1080_30fps.mp4
-rw-r--r-- 1 sysop sysop 95 Dec 30 19:08 attachments.sql
-rwxr-xr-x 1 sysop sysop 313 Dec 31 20:38 delete
-rwxr-xr-x 1 sysop sysop 307 Dec 30 19:13 download
-rw-r--r-- 1 sysop sysop 15M Dec 31 20:23 IG24-ImpressGuide.pdf
-rw-r--r-- 1 sysop sysop 1.1K Dec 31 20:53 pdf1.txt
-rw-r--r-- 1 sysop sysop 90K Dec 31 20:54 pdf2.txt
-rw-r--r-- 1 sysop sysop 1.8M Mar 21 2024 Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf
-rw-r--r-- 1 sysop sysop 800K Mar 25 2024 ROG_CROSSHAIR_X670E_HERO_h732
-rwxr-xr-x 1 sysop sysop 361 Dec 31 19:55 upload
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ sha1sum *
135b6d193c9acba8ea180dea7424b863ce52858b 11537353-hd_1920_1080_30fps.mp4
6419dd4da55e7141eacd14b4595beb3e7a676515 attachments.sql
264bef4b65b35c8abc39037e933a49840fd2527d delete
ca3515cb69d4f770cff9afc48dda36882c1a1151 download
f83be20cb0439e5008242ec7a01cf1fe168f413b IG24-ImpressGuide.pdf
69924db7187585bbf66986e91bfe249acb28198d pdf1.txt
c42c05b32797f22c91579caac9aa1ffba7b21298 pdf2.txt
acfa51b963b5a1fb9eecaf7954ae39c9bea736cc Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf
9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 ROG_CROSSHAIR_X670E_HERO_h732
6b80836cb4b2c968a13151ef14fda06153f71e0c upload
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$
the upload script
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ cat upload
#!/bin/bash
HASH=`sha1sum $1 |sed 's/ /|/' | awk -F '|' '{print $1}'`
echo ===== HASH ========
echo $HASH
echo =
echo ===== URL =========
URL=192.168.151.21:4200/_blobs/attachments/$HASH
echo $URL
echo =
echo ===== UPLOAD ======
echo =
curl -vvvv -isSX PUT $URL --user system:S9xsloAiGVvZ7iqaDQjtAzasmiFedEYlc1ajapZWDikhPbZTkj --data-binary @$1
echo ' '
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$
The pdf file that was not uploaded
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ ./upload Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf >pdf1.txt 2>&1
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ head -n 30 pdf1.txt
===== HASH ========
acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
=
===== URL =========
192.168.151.21:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
=
===== UPLOAD ======
=
* Trying 192.168.151.21:4200...
* Connected to 192.168.151.21 (192.168.151.21) port 4200 (#0)
* Server auth using Basic with user 'system'
> PUT /_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc HTTP/1.1
> Host: 192.168.151.21:4200
> Authorization: Basic c3lzdGVtOlM5eHNsb0FpR1Z2WjdpcWFEUWp0QXphc21pRmVkRVlsYzFhamFwWldEaWtoUGJaVGtq
> User-Agent: curl/7.88.1
> Accept: */*
> Content-Length: 1883881
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 307 Temporary Redirect
< content-length: 0
< location: http://10.42.2.216:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
* HTTP error before end of send, stop sending
<
* Closing connection 0
HTTP/1.1 307 Temporary Redirect
content-length: 0
location: http://10.42.2.216:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$
The pdf file that uploaded correctly
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ ./upload IG24-ImpressGuide.pdf >pdf2.txt 2>&1
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ head -n 30 pdf2.txt
===== HASH ========
f83be20cb0439e5008242ec7a01cf1fe168f413b
=
===== URL =========
192.168.151.21:4200/_blobs/attachments/f83be20cb0439e5008242ec7a01cf1fe168f413b
=
===== UPLOAD ======
=
* Trying 192.168.151.21:4200...
* Connected to 192.168.151.21 (192.168.151.21) port 4200 (#0)
* Server auth using Basic with user 'system'
> PUT /_blobs/attachments/f83be20cb0439e5008242ec7a01cf1fe168f413b HTTP/1.1
> Host: 192.168.151.21:4200
> Authorization: Basic c3lzdGVtOlM5eHNsb0FpR1Z2WjdpcWFEUWp0QXphc21pRmVkRVlsYzFhamFwWldEaWtoUGJaVGtq
> User-Agent: curl/7.88.1
> Accept: */*
> Content-Length: 14871982
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
} [65536 bytes data]
* We are completely uploaded and fine
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
< HTTP/1.1 100 Continue
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$
The blob table content
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest | last_modified |
+------------------------------------------+---------------+
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735648816064 |
| f83be20cb0439e5008242ec7a01cf1fe168f413b | 1735674869161 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735648828388 |
+------------------------------------------+---------------+
SELECT 3 rows in set (0.010 sec)
cr>
And this is stranger: trying to delete a nonexistent blob gives a 307 return code instead of 404.
The blob table contains 3 blobs
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest | last_modified |
+------------------------------------------+---------------+
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735753995280 |
| f83be20cb0439e5008242ec7a01cf1fe168f413b | 1735754021068 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735754093771 |
+------------------------------------------+---------------+
SELECT 3 rows in set (0.014 sec)
cr>
I use this script to download the blob
#!/bin/bash
echo "usage: ./download [sha1sum] [file name]"
echo ===== HASH FILE ===
echo HASH $1
echo =
echo ===== URL =========
URL=192.168.151.21:4200/_blobs/attachments/$1
echo $URL
echo =
echo ===== DOWNLOAD ======
echo =
curl -vsS --output $2 $URL --user system:S9xsloAiGVvZ7iqaDQjtAzasmiFedEYlc1ajapZWDikhPbZTkj
echo ' '
Downloading gives a 307 return code
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments/downloads$ ../download acfa51b963b5a1fb9eecaf7954ae39c9bea736cc Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf
usage: ./download [sha1sum] [file name]
===== HASH FILE ===
HASH acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
=
===== URL =========
192.168.151.21:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
=
===== DOWNLOAD ======
=
* Trying 192.168.151.21:4200...
* Connected to 192.168.151.21 (192.168.151.21) port 4200 (#0)
* Server auth using Basic with user 'system'
> GET /_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc HTTP/1.1
> Host: 192.168.151.21:4200
> Authorization: Basic c3lzdGVtOlM5eHNsb0FpR1Z2WjdpcWFEUWp0QXphc21pRmVkRVlsYzFhamFwWldEaWtoUGJaVGtq
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 307 Temporary Redirect
< content-length: 0
< location: http://10.42.3.202:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc
<
* Connection #0 to host 192.168.151.21 left intact
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments/downloads$
However, the documentation says that I should receive a 404 return code.
Maybe this is because my CrateDB is installed inside a Kubernetes cluster?
sysop@h5a-dev:~$ kubectl get all -n for-crate
NAME READY STATUS RESTARTS AGE
pod/crate-data-hot-my-cluster-2 2/2 Running 0 109m
pod/crate-data-hot-my-cluster-0 2/2 Running 0 109m
pod/crate-data-hot-my-cluster-1 2/2 Running 0 109m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/crate-discovery-my-cluster ClusterIP None <none> 4300/TCP,4200/TCP,5432/TCP 27d
service/crate-my-cluster LoadBalancer 10.43.47.153 192.168.151.21,192.168.151.22,192.168.151.23 4200:31656/TCP,5432:30136/TCP 27d
NAME READY AGE
statefulset.apps/crate-data-hot-my-cluster 3/3 27d
sysop@h5a-dev:~$
Is the 307 return code masking another error in download and upload?
smu
January 2, 2025, 8:07am
@MirtoBusico
As also stated in the documentation: “any successful request could lead to a 307 Temporary Redirect response.”
This is done for performance reasons: the client is redirected to the node that holds the requested blob file or is responsible for storing the blob being uploaded. Blob files are sharded the same way as normal records, with the limitation that each blob file is stored as a whole on one concrete node (its content is not chunked and split across the cluster).
To avoid streaming potentially large content through the connected node (a.k.a. the handler node), we return a 307 redirect header so that the client connects directly to the node that holds the file or is responsible for storing it. This sharding logic uses the blob’s sha1 digest and the number of data nodes N by simply running: digest % N = <TARGET_NODE>.
For historical reasons, the check whether a blob digest really exists happens only after the redirect, on the node responsible for the concrete blob file.
We could check this earlier nowadays (as we also store the digests inside a Lucene index); please file a feature request on our GitHub repository if needed.
This should not happen for DELETE (as no potentially large data is sent there), but only for GET, PUT or HEAD requests.
In your example I do not see any DELETE request being redirected. If this is really the case, this would be a bug.
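As a worked illustration of the simplified formula above, here is a minimal bash sketch that reduces a hex digest modulo the number of nodes. The function name hex_mod is made up for this example, and treating the digest as one large hexadecimal integer is an assumption; the routing CrateDB really performs may differ, so do not rely on this to predict the target node.

```shell
#!/bin/bash
# Illustration only: reduce a hexadecimal digest modulo N one digit at a time,
# so the 160-bit value never has to fit into a shell integer.
# ASSUMPTION: reading "digest % N" as plain integer arithmetic on the hex value;
# this is not taken from the CrateDB source code.
hex_mod() {
  local digest=$1 nodes=$2 remainder=0 i
  for ((i = 0; i < ${#digest}; i++)); do
    # 16#x parses the single hex digit x as a base-16 number.
    remainder=$(( (remainder * 16 + 16#${digest:i:1}) % nodes ))
  done
  echo "$remainder"
}
```

For example, hex_mod acfa51b963b5a1fb9eecaf7954ae39c9bea736cc 3 prints a node index between 0 and 2 for a three-node cluster like the one in this thread.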
Hi @smu
You are right: delete gives 404; it is the attempt to download a nonexistent blob that gives 307.
Now my question is: how can I reliably upload and download blobs, for now with test scripts and in the future with an application?
Do you have any examples of this?
BTW, right now I’m testing from a machine on the same network as the Kubernetes cluster, but (once I’m confident that I know how it works) CrateDB will be accessed by a pod that connects to the internal CrateDB service. So my pod will not know which node will be used.
sysop@h5a-dev:~$ kubectl get svc -n for-crate
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
crate-discovery-my-cluster ClusterIP None <none> 4300/TCP,4200/TCP,5432/TCP 28d
crate-my-cluster LoadBalancer 10.43.47.153 192.168.151.21,192.168.151.22,192.168.151.23 4200:31656/TCP,5432:30136/TCP 28d
sysop@h5a-dev:~$
Any example on how to manage blobs in this situation?
Hi @smu ,
I noticed that the redirect URL is the internal service address for CrateDB.
So, using a test pod, I issued the same download command for a nonexistent blob and received a 404 return code:
root@crashtest-765cc8c88d-6dnk6:/# curl -vsS --output Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf 10.42.2.14:4200/_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc --user system:S9xsloAiGVvZ7iqaDQjtAzasmiFedEYlc1ajapZWDikhPbZTkj
* Trying 10.42.2.14:4200...
* Connected to 10.42.2.14 (10.42.2.14) port 4200 (#0)
* Server auth using Basic with user 'system'
> GET /_blobs/attachments/acfa51b963b5a1fb9eecaf7954ae39c9bea736cc HTTP/1.1
> Host: 10.42.2.14:4200
> Authorization: Basic c3lzdGVtOlM5eHNsb0FpR1Z2WjdpcWFEUWp0QXphc21pRmVkRVlsYzFhamFwWldEaWtoUGJaVGtq
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 404 Not Found
< content-length: 0
< x-envoy-upstream-service-time: 0
< date: Thu, 02 Jan 2025 10:39:10 GMT
< server: envoy
<
* Connection #0 to host 10.42.2.14 left intact
root@crashtest-765cc8c88d-6dnk6:/#
Is this the correct behaviour?
smu
January 2, 2025, 1:42pm
Yes, this is the correct behaviour. As said, only the node responsible for the concrete digest knows if the file really exists (for DELETE we do an internal redirect).
And sorry that I didn’t mention this earlier: the 307 redirect will always return the correct URL as the Location header value. This follows the HTTP protocol specs and many clients can use it, e.g. with curl use the --location-trusted flag (in order to also send the credentials to the redirect host).
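A minimal sketch of how the test scripts in this thread could follow the redirect is shown below. The variable names CRATE_URL, BLOB_TABLE and CRATE_AUTH and all function names are hypothetical, and the default address is simply the internal service address mentioned in this thread; adapt them to your setup. It assumes the client can reach the address returned in the Location header.

```shell
#!/bin/bash
# Hypothetical settings for this sketch; none of these names come from CrateDB.
CRATE_URL=${CRATE_URL:-http://10.43.47.153:4200}
BLOB_TABLE=${BLOB_TABLE:-attachments}
CRATE_AUTH=${CRATE_AUTH:-user:password}

# Print only the SHA-1 digest of a file (sha1sum prints "<digest>  <name>").
blob_digest() {
  sha1sum -- "$1" | cut -d ' ' -f 1
}

# Build the blob URL for a given digest.
blob_url() {
  echo "$CRATE_URL/_blobs/$BLOB_TABLE/$1"
}

# Upload a file with PUT. --location-trusted follows the 307 redirect and
# resends the credentials to the redirect host; --fail turns HTTP errors
# (status 400 and above) into a non-zero exit code.
blob_upload() {
  local file=$1
  curl -sS --fail --location-trusted --user "$CRATE_AUTH" \
       --upload-file "$file" "$(blob_url "$(blob_digest "$file")")"
}

# Download the blob with digest $1 into the file $2, following the redirect.
blob_download() {
  curl -sS --fail --location-trusted --user "$CRATE_AUTH" \
       --output "$2" "$(blob_url "$1")"
}
```

After sourcing the file, blob_upload IG24-ImpressGuide.pdf uploads a file, and blob_download <digest> <file name> fetches it again. With --fail, a 404 for a missing blob shows up as a non-zero exit code instead of a silent redirect.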
Hi @smu
It took some time, but I prepared a test application inside a pod.
Everything worked as expected.
Running crash from this pod, I used the internal service address for CrateDB (10.43.47.153).
sysop@h5a-dev:~$ kubectl get svc -n for-crate
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
crate-discovery-my-cluster ClusterIP None <none> 4300/TCP,4200/TCP,5432/TCP 30d
crate-my-cluster LoadBalancer 10.43.47.153 192.168.151.21,192.168.151.22,192.168.151.23 4200:31656/TCP,5432:30136/TCP 30d
sysop@h5a-dev:~$
sha1sum for the files to be uploaded
root@crashtest-9dfdd4b8c-qlwhz:/usr/share/nginx/html/data/attachments/uploads# sha1sum *
135b6d193c9acba8ea180dea7424b863ce52858b 11537353-hd_1920_1080_30fps.mp4
f83be20cb0439e5008242ec7a01cf1fe168f413b IG24-ImpressGuide.pdf
acfa51b963b5a1fb9eecaf7954ae39c9bea736cc Q23155_ROG_MAXIMUS_Z790_HERO_BTF_QSG_WEB.pdf
9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 ROG_CROSSHAIR_X670E_HERO_h732
root@crashtest-9dfdd4b8c-qlwhz:/usr/share/nginx/html/data/attachments/uploads#
The blob table after uploading
cr> SELECT * FROM BLOB.attachments;
+------------------------------------------+---------------+
| digest | last_modified |
+------------------------------------------+---------------+
| acfa51b963b5a1fb9eecaf7954ae39c9bea736cc | 1735995025468 |
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735994937133 |
| f83be20cb0439e5008242ec7a01cf1fe168f413b | 1735994988697 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735995055568 |
+------------------------------------------+---------------+
SELECT 4 rows in set (0.014 sec)
cr>
The upload script
#!/bin/bash
HASH=`sha1sum $1 |sed 's/ /|/' | awk -F '|' '{print $1}'`
echo ===== HASH ========
echo $HASH
echo =
echo ===== URL =========
URL=10.43.47.153:4200/_blobs/attachments/$HASH
echo $URL
echo =
echo ===== UPLOAD ======
echo =
curl -v -isSX PUT $URL --user system:S9xsloAiGVvZ7iqaDQjtAzasmiFedEYlc1ajapZWDikhPbZTkj --data-binary @$1
echo ' '